How do you evaluate the success of a private school voucher program? Two excellent new articles in the Peabody Journal of Education offer two very different answers to this question. First, Anna Egalite and Patrick Wolf review the literature on student-level voucher impacts, mostly on test scores, paying particular attention to the "gold-standard" random student assignment studies. Second, Christopher Lubienski and Jameson Brewer are critical of the focus on "gold-standard" studies and conclude that no research consensus on voucher impacts exists.
I am an academic who walks somewhat uncomfortably in both the education policy and public management research worlds. Personally, I think this gives me a unique perspective (though more than one journal editor and hiring committee has disagreed!). Unique or not, the perspective draws me to this debate. The best part about public administration is the difficulty of defining the public interest. Doing what is best for the public, whether as a manager, a frontline employee, or an academic dreaming up new governance paradigms, involves ambiguities and tradeoffs. As such, a central question for me when considering the success of a voucher program is the economy of force question: are we (the royal we) minimizing the energy spent on secondary objectives?
A random assignment experiment cannot answer this question. An experiment showing gains for voucher users must be interpreted in its proper context. Is the program stable? Is it the subject of political infighting? Are the goals of the program clearly defined and broadly accepted? What are the opportunity costs? What are the fiscal implications? Is the program accountable to voters? What exactly does accountability mean in the eyes of program users and those paying the bill? What is the public’s tolerance for school failures? I could go on and on with these broad questions, and come up with many more in each specific context.
I am not saying the random assignment studies do not matter. They are an important part of evaluating the impacts of voucher policies. But there is so much more nuance involved. For example, if two cities showed identical test score gains for voucher users, it is entirely possible for one program to be a success and the other a failure. One program might be politically stable and accepted by the community, while the other may be the subject of constant, counterproductive political battles. One program might be meeting its original performance expectations, while the other is not. My point is that context, unity of purpose, nuance, expectations, and public acceptability all matter in the voucher performance debate. Given this, it is really hard for me to conclude that a research consensus exists, or even can exist.