Tuesday, January 16, 2018

"The problem with p-values"



"The problem of how to distinguish a genuine observation from random chance is a very old one. It’s been debated for centuries by philosophers and, more fruitfully, by statisticians. It turns on the distinction between induction and deduction. Science is an exercise in inductive reasoning: we are making observations and trying to infer general rules from them. Induction can never be certain. In contrast, deductive reasoning is easier: you deduce what you would expect to observe if some general rule were true and then compare it with what you actually see. The problem is that, for a scientist, deductive arguments don’t directly answer the question that you want to ask... 

Confusion between these two quite different probabilities lies at the heart of why p-values are so often misinterpreted. It’s called the error of the transposed conditional. Even quite respectable sources will tell you that the p-value is the probability that your observations occurred by chance. And that is plain wrong... 

the dichotomy between ‘significant’ and ‘not significant’ is absurd. There’s obviously very little difference between the implication of a p-value of 4.7 per cent and of 5.3 per cent, yet the former has come to be regarded as success and the latter as failure. And ‘success’ will get your work published, even in the most prestigious journals. That’s bad enough, but the real killer is that, if you observe a ‘just significant’ result, say P = 0.047 (4.7 per cent) in a single test, and claim to have made a discovery, the chance that you are wrong is at least 26 per cent, and could easily be more than 80 per cent."



FB: "it’s of little use to say that your observations would be rare if there were no real difference between the pills (which is what the p-value tells you), unless you can say whether or not the observations would also be rare when there is a true difference between the pills"
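
The point in the quote above, that a p-value is not the probability that a result arose by chance, can be illustrated with a few lines of arithmetic. This is a minimal sketch, not the calculation behind the 26 per cent figure quoted earlier: it uses the simpler "p at or below the threshold" reasoning, and the function name `false_positive_risk`, the prior probabilities, and the power of 0.8 are all assumptions chosen for illustration. It asks what fraction of "significant" results are false positives, given how often a real effect exists and how often a test would detect it.

```python
def false_positive_risk(prior_real, power, alpha):
    """Fraction of 'significant' results that are false positives.

    prior_real: assumed probability that a real effect exists (hypothetical input)
    power:      probability the test is significant when the effect is real
    alpha:      significance threshold, i.e. P(significant | no real effect)
    """
    false_pos = (1.0 - prior_real) * alpha   # no effect, yet 'significant'
    true_pos = prior_real * power            # real effect, correctly detected
    return false_pos / (false_pos + true_pos)


# Illustrative inputs only: the priors and power are assumed, not from the article.
for prior in (0.5, 0.1):
    risk = false_positive_risk(prior_real=prior, power=0.8, alpha=0.05)
    print(f"prior={prior:.1f}  false positive risk={risk:.3f}")
```

Under these assumptions the answer depends heavily on the prior and on the power, which is the second half of the quote: knowing that the observations would be rare with no real difference says little until you also know how rare they would be with one.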
