There has been a lot of recent work on p-hacking (making things statistically significant through taking advantage of analysis degrees-of-freedom), which I think is good (it’s starting to make people aware of the scope of the problem facing social psychology and related fields); however, I think people are missing something fundamental.
As Tal Yarkoni recently pointed out (and as I pointed out in a previous blog post), the incentives in the academy are messed up. Success in funding, in getting a job, etc, all hinges on your ability to produce positive results. When you livelihood literally depends on getting a positive result, it’s very hard to avoid putting your thumb on that scale.
So the solutions thus far proffered involve things like “publishing your data” and other such controls that will purport to “solve” this problem. However, the deep problem with this can be illustrated with a hypothetical computer program called “the Fake-ulator” (I thought about actually writing this program–but I think the thought experiment is enough for now). Version 1 is just a beta, so it only works for Likert scales. But the idea is simple enough–if we scour the literature for Likert scale data and effects we quickly realize that simple random draws from a response distribution will be easy to spot. Humans have lots of unique biases that lead to systematic patterns in response data like Likert scale data. So, the authors of the Fake-ulator have scoured the literature and have built a random data generator that generates data that looks indistinguishable statistically from real human response data! Better yet, you can input an effect size and generate beautiful (but not too beautiful) data that is statistically significant. You can even generate a fake file drawer, since many of these fake experiments will be “failures”! But hey, since your fake effect is positive, random fake experiments on average will find your effect. So with a computer program like this, you could easily imagine someone faking all of their data in a way that no one would ever notice.
Now what keeps me up at night is, does this computer program already exist? Did we only catch the really dumb fakers who didn’t take the time to do it the right way? One objection might be that anyone smart enough to do this will just run the studies–I think this is wrong. Actually running the studies leaves things up to chance. If you really want a 6-figure tenure track job at Harvard or Princeton, real data just won’t do!
The point of this is just to say that we need more than just clever statistics and safeguards–until we fundamentally change the incentives of science to reward process instead of outcome, we aren’t going to solve this problem. We are only going to make it much harder to determine if something is real or not. The adaptations are already upon us!