Psience Quest

Full Version: Statistical tests: one-tailed versus two-tailed


I was a bit taken aback to read in a paper by Schooler et al., which was part of the recent discussion of precognition in the journal Psychology of Consciousness, this bald statement in the introduction to the outline of existing experimental evidence:
"One-tailed p values were converted to two-tailed."

(When a psi process is expected to increase an experimentally measured quantity - such as the number of correct guesses - a one-tailed p value is the probability of that quantity being at least as large as the observed value by pure chance, on the assumption that psi doesn't exist. In contrast, the two-tailed p value is the probability of the quantity being at least as far from chance expectation as the observed value in either direction, so it also covers results well below chance expectation. At a given significance level, a two-tailed test would therefore count some instances of "psi-missing" as significant results, but to make up for that it would not count as significant some of the weaker results in the expected direction.)
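
To make the two definitions concrete, here is a rough sketch in Python using an exact binomial calculation. The set-up is my own assumption, not anything from the paper: 100 binary guesses with a 50% chance of a hit, and the function names are just illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k hits in n trials when each hit has chance p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def one_tailed_p(hits, n, p):
    """P(X >= hits): doing at least this well in the predicted direction."""
    return sum(binom_pmf(k, n, p) for k in range(hits, n + 1))

def two_tailed_p(hits, n, p):
    """P(|X - n*p| >= |hits - n*p|): at least this far from chance, either way."""
    expected = n * p
    distance = abs(hits - expected)
    # Small tolerance guards against floating-point error when n*p is not exact.
    return sum(binom_pmf(k, n, p) for k in range(n + 1)
               if abs(k - expected) >= distance - 1e-9)

if __name__ == "__main__":
    n, p = 100, 0.5  # hypothetical: 100 binary guesses, 50 hits expected by chance
    for hits in (59, 35):  # a weak positive result and a strong "psi-missing" result
        print(hits,
              round(one_tailed_p(hits, n, p), 4),
              round(two_tailed_p(hits, n, p), 4))
```

With 59 hits the one-tailed value comes out below 0.05, but the two-tailed value is exactly double and lands above it. With 35 hits the one-tailed value is close to 1, while the two-tailed value is very small, so only the two-tailed test would flag it.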

Now I know that sceptics tend to be keen on two-tailed tests, because most published experimental results are in the expected direction, so two-tailed tests reduce the number of significant results.
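
The paper doesn't say how the conversion was done. My assumption is that it was the usual rule for a symmetric null distribution with the result in the predicted direction: double the one-tailed value and cap it at 1. A minimal sketch of what that does to a set of results, using made-up p values rather than anything from the paper:

```python
def to_two_tailed(p_one_tailed):
    """Double a one-tailed p value, capped at 1.

    Assumes a symmetric null distribution and an effect in the predicted direction.
    """
    return min(1.0, 2.0 * p_one_tailed)

ALPHA = 0.05
reported = [0.01, 0.03, 0.04, 0.20]  # made-up one-tailed p values, for illustration only
converted = [to_two_tailed(p) for p in reported]

significant_before = sum(p < ALPHA for p in reported)
significant_after = sum(p < ALPHA for p in converted)
print(significant_before, significant_after)  # 3 and 1
```

Any one-tailed value between 0.025 and 0.05 stops being significant at the 0.05 level once it is doubled.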

But I wonder if anyone can come up with a really valid reason why two-tailed tests are preferable, if there is an a priori expectation that the effect will act in a particular direction.

I must admit that at one stage I preferred two-tailed tests myself, but that was based on the rather woolly argument that what was being tested was the null hypothesis, and that had no preferred direction. Now I don't think that argument holds water. On the whole one-tailed tests seem more appropriate if there is a preferred direction. But it's perhaps partly a matter of taste. The only really vital thing is that the form of the test must be decided before the data have been seen - which is why I'm particularly surprised that Schooler et al. would make a post hoc alteration to the reported results they are presenting.