What does it mean to be “data-driven”?
- Steve Nuzum
- Aug 18
Pandemic-era school policies are a useful example of how schools and districts often “use data.”
During the Covid-19 pandemic, the experiences of Americans varied widely. Some people never got seriously ill or knew anyone who did. Some people experienced or witnessed unprecedented tragedies.
I had a large number of students and colleagues who were seriously ill after infection, and/or who lost loved ones to the virus. Several teachers in my district died from acute infection. Two of my students, siblings, lost both parents in the early weeks of the pandemic.
The pandemic coincided with what seems to be a significant low in American trust in science and scientific data. I think this happened because people with vastly different experiences and motivations were left with very little hard factual information and a vested interest in seeing the pandemic in very different ways.
South Carolina Governor Henry McMaster pushed early for a full return to in-person instruction across the state. A physician representing the local chapter of the American Academy of Pediatrics testified at a hearing early in the pandemic that she believed it was relatively safe-- based on then-available data-- for children to return to school. At that point, there were no large-scale, high-quality studies of the new virus or its transmission, and even studies that did trace transmission of one variant didn’t necessarily predict how the virus would spread as it mutated.
But over time, as more and more large-scale studies painted a more complex picture of how the virus spread, the governor and others continued to cite that early testimony as evidence that kids didn’t get the virus. They did so long after the doctor who had testified, and the American Academy of Pediatrics itself, had updated their recommendations based on new data, encouraging far more caution, including masking for students and staff and school closure policies tied to local disease data.
McMaster continued to oppose mask mandates over the advice of health experts, and to downplay the importance of masks for students, even as childhood hospitalizations for COVID went up in the state.
My school district was one of the last in South Carolina to go back to fully in-person instruction-- probably because many of the zip codes we served were among the hardest-hit in the state by the virus. As state legislators moved to force the district to reopen, the case went all the way to the state Supreme Court, which ultimately held that the state had the power to force a return.
During this time, the district superintendent vowed to follow the science and to tie decisions about when and how to return schools to in-person instruction to rates of infection in the community.
Some zip codes in that district continued to frequently lead the state in new cases of the virus, with some of the highest rates of hospitalization and death, according to the Department of Health (then DHEC) at the time.
I created a spreadsheet to try to help visualize the data, which is available here. (I am not a health expert or statistician; this spreadsheet, which I updated and shared with the district every week, was only intended to help inform district policy that was supposedly based on local disease data.)
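For a rough sense of what that kind of tracking involves, here is a minimal sketch in Python (the zip-code labels, populations, and case counts below are all made up for illustration) of the sort of calculation such a tracker might do: convert raw weekly case counts into new cases per 100,000 residents so that zip codes of different sizes can be compared.

```python
# Toy sketch: weekly new cases per 100,000 residents, by zip code.
# All labels and numbers below are invented; a real tracker would pull
# weekly counts from the state health department's published data.

from collections import defaultdict

# Hypothetical population estimates by zip code.
POPULATION = {"Zip A": 41000, "Zip B": 38000, "Zip C": 60000}

# Hypothetical weekly new-case counts: (zip, week, new cases).
WEEKLY_CASES = [
    ("Zip A", "2020-W40", 95), ("Zip A", "2020-W41", 130),
    ("Zip B", "2020-W40", 60), ("Zip B", "2020-W41", 85),
    ("Zip C", "2020-W40", 70), ("Zip C", "2020-W41", 72),
]

def incidence_per_100k(cases: int, population: int) -> float:
    """Convert a raw case count into a rate per 100,000 residents."""
    return cases / population * 100_000

rates = defaultdict(dict)
for zip_code, week, cases in WEEKLY_CASES:
    rates[zip_code][week] = incidence_per_100k(cases, POPULATION[zip_code])

for zip_code, weeks in rates.items():
    trend = ", ".join(f"{week}: {rate:.0f}" for week, rate in weeks.items())
    print(f"{zip_code} -> {trend} new cases per 100k")
```

The point of normalizing by population is simply that a raw count of 100 cases means something very different in a zip code of 10,000 people than in one of 60,000.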
But the district’s policies ultimately had little objective connection to the data. For better or for worse (and there are many who would argue for both positions), we returned to in-person instruction after a brief dip in COVID numbers, but we never returned to remote instruction, or implemented other specific mitigation strategies, even when we again saw some of the worst numbers in the state.
During the same time, a survey of parents and staff in the district showed that most participants wanted to remain in remote instruction while cases were high. The district did not conduct further surveys.
In other words, politics or other considerations drove most of these decisions, rather than data (even data about parent preferences). Whether any decision ultimately proved to be effective or ineffective, we didn’t follow through on the district’s vow to base policy on the number of cases. We were not, objectively speaking, data-driven.
But for teachers, this inconsistent approach probably wasn’t much of a surprise.
One of the few constants throughout my teaching career was the insistence that education should be “data-driven”. One of the few other constants was that this term almost never matched up with what academic researchers mean when they use the same term.
When scientific or academic researchers say “data-driven,” they generally mean they have asked a question (usually related to a testable theory) and are seeking out any relevant data that can help to answer that question (usually in the form of supporting, debunking, or modifying the original theory).
In health research, for example, effective researchers would set out to test whether a medication was effective at treating a given condition, and then would conduct something like a randomized controlled trial (RCT), where two groups of people, each chosen “randomly” in a way that aims to statistically represent the general population to be treated, receive either the medication or a placebo. The data, then, is derived entirely from the process of trying to answer a question which (ideally) has no predetermined answer. (Of course, many medical studies are financed by pharmaceutical companies, and good researchers have to work hard to disentangle any financial or other motives that could lead them to cherry-pick data to get to a preferred conclusion.)
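To make that logic concrete, here is a toy-sized sketch in Python (the sample size, baseline recovery rate, and treatment effect are all invented) of the basic mechanics of an RCT: assign participants to medication or placebo at random, then compare the two groups’ outcomes.

```python
# Toy simulation of the logic behind a randomized controlled trial.
# Everything here is invented for illustration: the sample size, the
# baseline recovery rate, and the assumed effect of the medication.

import random

random.seed(0)

N = 1000                  # invented number of participants
BASELINE_RECOVERY = 0.30  # invented recovery rate on placebo
TREATMENT_BOOST = 0.15    # invented additional recovery rate from the drug

participants = list(range(N))
random.shuffle(participants)                  # the randomization step
treatment_group = set(participants[: N // 2])

def simulate_outcome(person: int) -> int:
    """Return 1 if the (simulated) person recovers, else 0."""
    p = BASELINE_RECOVERY + (TREATMENT_BOOST if person in treatment_group else 0.0)
    return 1 if random.random() < p else 0

outcomes = {person: simulate_outcome(person) for person in range(N)}

treated = [outcomes[p] for p in range(N) if p in treatment_group]
placebo = [outcomes[p] for p in range(N) if p not in treatment_group]

print(f"Recovery rate, treatment: {sum(treated) / len(treated):.2%}")
print(f"Recovery rate, placebo:   {sum(placebo) / len(placebo):.2%}")
# Because assignment was random, a persistent gap between these two rates
# is evidence about the medication itself, not about who chose to take it.
```

The randomization is doing the real work here: it is what lets a difference in outcomes be attributed to the treatment rather than to pre-existing differences between the groups.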
But in my experience “data-driven” school policies are rarely based on either a primary research process or the effective use of existing research.
When state and district leaders, and many school leaders and administrators, say “data-driven,” they often mean one of two things:
They have decided upon a course of action, and then have acquired some kind of numbers or other seemingly quantifiable information to support that course of action. (If they have ignored other information that might undermine their argument, that is called “cherry-picking” the data.)
They have been told by a third party “expert” (often a corporate consultant or other self-declared professional development guru) that the data supports a course of action.
For example, I once worked in a large urban/suburban school district which paid an unknown (but apparently very large) amount of money for a bank of “benchmark” test questions and an accompanying interface for assigning students regular “benchmark” assessments. When I spoke with one school board member about the program, he told me the company which created the test bank had made a presentation to the board claiming that student benchmark scores were correlated with scores on state assessments, and that a neighboring district had seen its test scores go up after adopting the benchmark assessments.
When I spoke with the district superintendent, he shared similar claims, and made the argument that using the standardized tests was a measure to support “equity” because all students would be given identical assessments, free of teacher biases. (He did not provide a data-supported rationale for believing that the tests were somehow less biased than teachers, and in fact many studies have found systemic biases in many standardized tests. The SC Education Oversight Committee found that at least one version of the state English test, for example, contained statistically significant racial biases. The benchmarks, in turn, were supposedly modeled on state tests.)
The implicit claim is that the benchmarks made students better at the skills represented by the state tests. But there is a long list of assumptions that have to be true in order for this conclusion to hold:
The apparent connection between scores on the benchmark tests and scores on the state tests has to actually be causal. (One of my favorite illustrations of this is Tyler Vigen’s “Spurious Correlations” website, which provides real data showing, for example, that the “Popularity of the ‘distracted boyfriend’ meme” seems to rise and fall in almost exactly the same way as the amount of “Hydropower energy generated in Turkmenistan”. Correlation does not necessarily equal causation.)
Other variables have to be less significant than the use of the benchmark. (For example, I was often praised by district staff because my students had some of the highest benchmark and state test scores in the district. However, students in my classes tended to be honors students and magnet students; both groups of students were selected because they typically performed well on standardized tests.)
The connection between state test scores and the skills we’re trying to measure has to be real. (It’s possible that the benchmark predicts that students will do well on the state test simply because the two tests are similar, and that the state test also does not do a good job of measuring what we want it to measure. As the Education Oversight Committee found, at least some of our tests measured students’ skills inconsistently depending on their race.)
And of course that’s only a small list of the statistical issues involved in the case that was apparently made to the school board.
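To make the first two assumptions concrete, here is a toy simulation (all numbers invented, written in Python only for illustration) in which a single hidden variable, honors placement, drives both benchmark scores and state test scores. The two sets of scores come out correlated even though, in this made-up world, the benchmark has no effect on the state test at all.

```python
# Toy illustration of the first two assumptions failing together: a hidden
# variable (honors placement) drives both benchmark and state test scores,
# so the two are correlated even though neither causes the other.
# All numbers here are invented.

import random

random.seed(1)

def pearson(xs, ys):
    """Plain Pearson correlation, to keep the sketch dependency-free."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

students = []
for _ in range(500):
    honors = random.random() < 0.3   # invented share of honors students
    base = 85 if honors else 60      # invented underlying ability proxy
    # Both scores depend on the same ability proxy plus independent noise;
    # the benchmark has no causal effect on the state test in this toy world.
    benchmark = base + random.gauss(0, 6)
    state_test = base + random.gauss(0, 6)
    students.append((honors, benchmark, state_test))

bench = [b for _, b, _ in students]
state = [s for _, _, s in students]
print(f"Correlation across all students: {pearson(bench, state):.2f}")

# Within each group the correlation largely vanishes, because the
# benchmark never actually influenced the state test scores.
for flag, label in [(True, "honors"), (False, "non-honors")]:
    b = [b for h, b, _ in students if h == flag]
    s = [s for h, _, s in students if h == flag]
    print(f"Correlation among {label} students: {pearson(b, s):.2f}")
```

In other words, a board presentation showing that benchmark scores track state test scores tells us nothing, by itself, about whether assigning more benchmarks would raise state test scores.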
I’m not aware of any large-scale analysis of whether benchmark data actually predicted state test scores (and most statistical analyses of standardized tests come from the testing companies themselves), but I saw firsthand a greater and greater push to use the benchmark assessments in class. (Often, when I was praised for my students’ scores on benchmarks and state tests, I would point out that I actually used the benchmarks as little as possible, and probably less than my average colleague, but district administrators never seemed interested in following up on whether that correlation might be important-- in other words, if less benchmarking improved test scores, the district, after spending taxpayer money on the program, wasn’t keen to know.)
Teachers are often not supported or equipped to use data like researchers.
In my entire career, while I had some excellent content coordinators and administrators, I never saw anyone mentor classroom teachers on effective ways to use data to guide instruction in ways that make statistical sense. (I did see a lot of “unpacking” test data, but generally in ways that made many of the assumptions I listed above; never did I observe a district or school empower its employees to ask real questions about their instruction or about student needs and then authentically analyze factual information to answer those questions in ways that benefited students.)
On the other hand, I saw teachers constantly incentivized to use data like politicians, to tell a preordained story, or to jump through the hoops of “student learning objectives”-- a good idea in theory that in practice usually boiled down to pretending to see a connection between some classroom intervention and end-of-the-year test scores.
As some state leaders float performance pay, I’m reminded of this approach to “data”. The story we want to tell is that good teachers, who are worth more money, create a specific output: better student test scores. This follows the same logic as our pandemic-era reliance on “data”-- but only as long as that “data” told us that what we wanted to do, or what we had already done, was best for kids.
And while there is a mixed research basis for the idea that performance pay is correlated with better test scores, this is only something worth incentivizing if we assume teaching to the test is a good thing. Studies also suggest that performance pay results in even more “effective” teachers leaving poor districts for wealthier ones, which can now pay them more-- and where, statistically, test scores are likely to be higher anyway.
Real data is a description of the real world.
It is crucial that we use what we know about the world to inform how we teach and how we run schools. Unfortunately, fake data and fake analysis of real data have done a lot of damage to the trust teachers and other decision-makers place in the research process. Just as incomplete information was weaponized for political purposes during the pandemic, it has been weaponized in the relatively less intense domain of instructional design.
But if our goal is to provide students with meaningful skills and information to help them to live successful lives, we can’t keep relying on a fantasy version of data where every test score has predictive validity, or where we allow ourselves to freeze decision-making at the moment researchers seem to agree with us, only to ignore those researchers later on when it becomes inconvenient to use current facts.