Statisticians respond to misuse and misinterpretation of “statistical significance” (p-values) in research

Bruce Boyes, 6 Jan 2017

Editor’s note: This article was first published on 8 March 2016. It was republished on 6 January 2017 to become part 7 of the special series Top 100 most-discussed journal articles of 2016.

Amid rising concerns about the reproducibility and replicability of scientific conclusions, the American Statistical Association (ASA) has released a formal statement¹ clarifying several widely agreed upon principles underlying the proper use and interpretation of the p-value:

Underpinning many published scientific conclusions is the concept of “statistical significance,” typically assessed with an index called the p-value. While the p-value can be a useful statistical measure, it is commonly misused and misinterpreted.

The six principles are:

1. P-values can indicate how incompatible the data are with a specified statistical model.
2. P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
3. Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
4. Proper inference requires full reporting and transparency.
5. A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
6. By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.
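
The principle that a p-value does not measure the size of an effect can be illustrated with a rough sketch. The code below is not taken from the ASA statement: it assumes a one-sided binomial test against a 50-50 null, and the two sets of counts are hypothetical numbers chosen only for illustration. The helper name `one_sided_binomial_p` is likewise made up for this example.

```python
from math import exp, lgamma, log


def one_sided_binomial_p(n: int, successes: int) -> float:
    """P(X >= successes) for X ~ Binomial(n, 0.5), summed in log space."""
    log_half_n = n * log(0.5)
    total = 0.0
    for k in range(successes, n + 1):
        # log of C(n, k) * 0.5**n, computed via log-gamma to avoid huge integers
        log_pmf = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1) + log_half_n
        total += exp(log_pmf)
    return total


# Hypothetical data, chosen only for illustration.
small_study = one_sided_binomial_p(10, 8)            # 80% success rate in 10 trials
large_study = one_sided_binomial_p(100_000, 51_000)  # 51% success rate in 100,000 trials

print(f"large effect, small sample: p = {small_study:.4f}")
print(f"small effect, large sample: p = {large_study:.2e}")
```

Under these assumptions, the 80% success rate in the small study gives a p-value just above 0.05, while the 51% success rate in the large study gives a p-value many orders of magnitude smaller. The smaller p-value belongs to the smaller effect, so the p-value on its own says little about how large or important an effect is.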

Research monitor Retraction Watch has interviewed ASA executive director and statement co-author Ron Wasserstein in regard to the new principles.

Wasserstein says that the biggest mistakes being made include the misuse of statistical significance as an arbiter of scientific validity, concluding that a null hypothesis is true because a computed p-value is large, and the logical fallacy of concluding something is true that you had to assume to be true in order to reach that conclusion.

The latter error relates to principle 2, which addresses a widespread misconception in regard to p-values. The Retraction Watch interviewer asks Wasserstein:

Some of the principles seem straightforward, but I was curious about #2 – I often hear people describe the purpose of a p value as a way to estimate the probability the data were produced by random chance alone. Why is that a false belief?

Wasserstein responds:

Let’s think about what that statement would mean for a simplistic example. Suppose a new treatment for a serious disease is alleged to work better than the current treatment. We test the claim by matching 5 pairs of similarly ill patients and randomly assigning one to the current and one to the new treatment in each pair. The null hypothesis is that the new treatment and the old each have a 50-50 chance of producing the better outcome for any pair. If that’s true, the probability the new treatment will win for all five pairs is (½)⁵ = 1/32, or about 0.03. If the data show that the new treatment does produce a better outcome for all 5 pairs, the p-value is 0.03. It represents the probability of that result, under the assumption that the new and old treatments are equally likely to win. It is not the probability the new treatment and the old treatment are equally likely to win.
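
The arithmetic in Wasserstein's example can be checked with a short sketch. The code below is an illustration rather than anything from the interview: the function names, the number of simulated trials, and the random seed are all assumptions made for this example. It computes the exact one-sided p-value and then approximates the same number by simulating many five-pair studies in which the null hypothesis holds.

```python
import random
from math import comb


def exact_p_value(n_pairs: int, wins: int) -> float:
    """P(new treatment wins at least `wins` of `n_pairs` pairs | 50-50 null)."""
    favourable = sum(comb(n_pairs, k) for k in range(wins, n_pairs + 1))
    return favourable / 2 ** n_pairs


def simulated_p_value(n_pairs: int, wins: int, trials: int = 200_000, seed: int = 0) -> float:
    """Estimate the same probability by simulating studies under the null."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Each pair is a fair coin flip because the null is assumed to be true.
        new_wins = sum(rng.random() < 0.5 for _ in range(n_pairs))
        if new_wins >= wins:
            hits += 1
    return hits / trials


exact = exact_p_value(5, 5)
approx = simulated_p_value(5, 5)

print(exact)  # 0.03125, i.e. 1/32
print(f"simulated: {approx:.3f}")
```

Every simulated study here is generated with the 50-50 assumption built in. The resulting figure therefore describes how often the observed result turns up when the null hypothesis is true, and cannot by itself say how likely the null hypothesis is.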

Reference:

1. Wasserstein, R.L. & Lazar, N.A. (2016). The ASA’s statement on p-values: context, process, and purpose. The American Statistician. DOI: 10.1080/00031305.2016.1154108
