A recent post for the What Works Centre that I thought would be good here too.
At the What Works Centre we’re keen on experiments. As we explain here, when it comes to impact evaluation, experimental and ‘quasi-experimental’ techniques generally stand the best chance of identifying the causal effect of a policy.
Researchers are also keen to experiment on themselves (or their colleagues). Here’s a great example from the Journal of Economic Perspectives, in which journal editors ran a randomised controlled trial on the academics who peer-review submissions.
Journal editors rely on these anonymous referees, who give their time for free, knowing that others will do the same when they submit their own papers. (For younger academics, being chosen to review papers for a top journal also looks good on your CV.)
Of course, this social contract sometimes breaks down. Reviewers often miss deadlines or drop out late in the process, but anonymity means that such bad behaviour rarely leaks out. To deal with this, some journals have started paying reviewers. But is that the most effective solution? To find out, Raj Chetty and colleagues conducted a field experiment on 1,500 reviewers at the Journal of Public Economics (where Chetty is an editor). Here’s the abstract:
We evaluate policies to increase prosocial behavior using a field experiment with 1,500 referees at the Journal of Public Economics. We randomly assign referees to four groups: a control group with a six-week deadline to submit a referee report; a group with a four-week deadline; a cash incentive group rewarded with $100 for meeting the four-week deadline; and a social incentive group in which referees were told that their turnaround times would be publicly posted. We obtain four sets of results.
First, shorter deadlines reduce the time referees take to submit reports substantially. Second, cash incentives significantly improve speed, especially in the week before the deadline. Cash payments do not crowd out intrinsic motivation: after the cash treatment ends, referees who received cash incentives are no slower than those in the four-week deadline group. Third, social incentives have smaller but significant effects on review times and are especially effective among tenured professors, who are less sensitive to deadlines and cash incentives. Fourth, all the treatments have little or no effect on rates of agreement to review, quality of reports, or review times at other journals. We conclude that small changes in journals’ policies could substantially expedite peer review at little cost. More generally, price incentives, nudges, and social pressure are effective and complementary methods of increasing pro-social behavior.
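For readers who like to see the mechanics, here is a minimal sketch of the kind of four-arm random assignment the abstract describes. It is an illustration only, not the authors' procedure: the arm labels, the equal arm sizes, the seed, and the function names `assign_arms` and `mean_difference` are all assumptions made for the example.

```python
import random
from collections import Counter

# Illustrative arm labels paraphrasing the abstract (not the authors' own names).
ARMS = ["control", "short_deadline", "cash", "social"]


def assign_arms(referee_ids, arms=ARMS, seed=0):
    """Randomly assign each referee to one arm, keeping arm sizes as equal as possible.

    Equal-sized arms are an assumption of this sketch.
    """
    rng = random.Random(seed)      # fixed seed so the assignment can be reproduced
    shuffled = list(referee_ids)
    rng.shuffle(shuffled)          # random order, then deal out arms in rotation
    return {rid: arms[i % len(arms)] for i, rid in enumerate(shuffled)}


def mean_difference(days_by_arm, treatment, control="control"):
    """Average turnaround (days) in a treatment arm minus the control average."""
    def mean(values):
        return sum(values) / len(values)
    return mean(days_by_arm[treatment]) - mean(days_by_arm[control])


# 1,500 hypothetical referee IDs, matching the sample size in the abstract.
assignment = assign_arms(range(1500))
arm_sizes = Counter(assignment.values())
print(arm_sizes)
```

Because assignment is random, a simple comparison of average turnaround times between a treatment arm and the control arm gives an estimate of that treatment's effect. With invented numbers, `mean_difference({"control": [40, 50], "cash": [20, 30]}, "cash")` returns a negative value, meaning the cash arm was faster; those figures are made up for illustration and are not results from the study.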
What can we take from this?
First, academics respond well to cash incentives. No surprise there, especially as these referees are all economists.
Second, academics respond well to tight deadlines – this may surprise you. One explanation is that many academics overload themselves and find it hard to prioritise. For such an overworked individual, tightening the deadline may do the prioritisation for them.
Third, the threat of public shame also works – especially for better-paid, more senior people with a reputation to protect (and less need to impress journal editors).
Fourth, this experiment highlights some bigger issues in evaluation generally. One is that understanding the logic chain behind your results is just as important as getting the result in the first place. Rather than resorting to conjecture, it’s important to design your experiment so you can work out what is driving the result. In many cases, researchers can use mixed methods – interviews or participant observation – to help do this.
Another is that context matters. I suspect that some of these results are driven by the power of the journal in question: for economists the JPubE is a top international journal, and many researchers would jump at the chance to help out the editor. A less prestigious publication might have more trouble getting these tools to work. It’s also possible that academics in other fields would respond differently to these treatments. In the jargon, we need to think carefully about the ‘external validity’ of this trial. In this case, further experiments – on sociologists or biochemists, say – would build our understanding of what’s most effective where.
A version of this post originally appeared on the What Works Centre for Local Economic Growth blog.