
Is this discussion significant, or is it statistically pedantic?

From Nature 455, 1023-1028 (2008)

Significant (adjective). Geoff Brumfiel

Few words in the scientific lexicon are as confusing, or as loaded, as 'significant'. Statisticians wring their hands over its cavalier use to describe scientific validity. And backed by statistics or not, researchers commonly employ the word to illustrate the importance of their latest finding.

The very definition of statistical significance is misunderstood by most scientists, says Steven Goodman, a biostatistician at the Johns Hopkins School of Medicine in Baltimore, Maryland, and associate editor of Annals of Internal Medicine. Typically, researchers take a result to be statistically significant based on 'p-values'. This statistic is used, for example, to judge whether a drug lowers cholesterol based on promising data collected in a clinical trial.

According to the common interpretation, a 'significant' result with a p-value of 0.05 or less means that there is a 5% or less chance that the drug is ineffective. According to the statistically accurate definition, it means that, if the drug is in fact ineffective, there is a 5% or less chance of seeing data at least as extreme as those observed. Rhetorically, the difference may seem imperceptible; mathematically, say statisticians, it is crucial. When the data are somewhat ambiguous, results can be misinterpreted. "It's diabolically tricky," Goodman says.
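
The gap between the two readings can be made concrete with a small simulation. The sketch below is only an illustration under made-up assumptions that are not in the article: 10% of the drugs tested are truly effective, an effective drug shifts the mean by half a standard deviation, each trial has 20 patients, and a one-sided z-test with known standard deviation is used. The function names and default values are hypothetical.

```python
import math
import random


def one_sided_p_value(sample_mean, n, sigma=1.0):
    """P(sample mean at least this large | true effect is zero), z-test, known sigma."""
    z = sample_mean / (sigma / math.sqrt(n))
    return 0.5 * math.erfc(z / math.sqrt(2))


def simulate(num_trials=20000, prior_effective=0.1, effect=0.5, n=20,
             alpha=0.05, seed=0):
    """Simulate many trials; every default value here is an illustrative assumption."""
    rng = random.Random(seed)
    ineffective = 0
    significant = 0
    significant_and_ineffective = 0
    for _ in range(num_trials):
        is_effective = rng.random() < prior_effective
        true_mean = effect if is_effective else 0.0
        # The mean of n draws from N(true_mean, 1) is N(true_mean, 1/sqrt(n)).
        sample_mean = rng.gauss(true_mean, 1.0 / math.sqrt(n))
        is_significant = one_sided_p_value(sample_mean, n) <= alpha
        if not is_effective:
            ineffective += 1
        if is_significant:
            significant += 1
            if not is_effective:
                significant_and_ineffective += 1
    return {
        # P(significant | drug ineffective): what the p-value threshold controls.
        "false_positive_rate": significant_and_ineffective / ineffective,
        # P(drug ineffective | significant): what the common reading assumes is 5%.
        "share_ineffective_among_significant": significant_and_ineffective / significant,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the first number stays close to 5%, while the second comes out far higher, on the order of a third. How far apart they are depends entirely on the assumed share of effective drugs and on the power of the trial, neither of which a p-value knows about.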

Most statisticians resign themselves to abuse of the term's strict definition. But more grievous trespasses abound. "Statistical significance is neither a necessary nor a sufficient condition for proving a scientific result," says Stephen Ziliak, an economist at Roosevelt University in Chicago, Illinois, and co-author of The Cult of Statistical Significance. P-values are often used to emphasize the certainty of data, but they are only a passive read-out of a statistical test and do not take into account how well an experiment was designed. A p-value would not reveal, for example, that everyone was taking different doses of that cholesterol drug. In many experiments, Ziliak says, "there are so many different errors that they tend to swamp the p-value errors".

Even if a result is a genuinely statistically significant one, it can be virtually meaningless in the real world. A new cancer treatment may 'significantly' extend life by a month, but many terminally ill patients would not consider that outcome significant. A scientific finding may be 'significant' without having any major impact on a field; conversely, the significance of a discovery might not become apparent until years after it is made. "One has to reserve for history the judgement of whether something is significant with a capital S," says Steven Block, a biophysicist at Stanford University in California.
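
A short calculation shows how statistical and practical significance can come apart. This is a sketch with invented numbers, assuming a two-sided z-test with known standard deviation and an observed effect exactly equal to the stated value; `two_sided_p_value` is a hypothetical helper, not something from the article.

```python
import math


def two_sided_p_value(observed_effect, sigma, n):
    """Two-sided z-test p-value for an observed mean effect with known sigma."""
    z = observed_effect / (sigma / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Tiny effect (2% of a standard deviation) measured on a very large sample.
    p_tiny_effect = two_sided_p_value(observed_effect=0.02, sigma=1.0, n=100_000)
    # Large effect (half a standard deviation) measured on a small sample.
    p_large_effect = two_sided_p_value(observed_effect=0.5, sigma=1.0, n=10)
    print(f"tiny effect, n=100000: p = {p_tiny_effect:.2e}")
    print(f"large effect, n=10:    p = {p_large_effect:.3f}")
```

With these inputs the tiny effect clears the 0.05 threshold easily and the large one does not, so the p-value by itself says little about whether an outcome matters to a patient.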

In some situations other statistical methods can substitute for p-values, but Goodman believes that trying to use them in the scientific literature would be like "talking Swahili in Louisiana". He says that he and other editors do their best to keep the term 'significant' out of Annals, though. "We ask them to use words like 'statistically detectable' or 'statistically discernable,'" he says.

Comment ... Compare, for example, the statement "The observed differences could occur 5% of the time if the true effect is zero" with the statement "The probability that the true effect is zero is 5%". Not only is the latter statement wrong, it does not match the scientific question, which should be to estimate, at a given probability, the minimum size of the effect. ... R. Allan Reese, Centre for Environment, Fisheries and Aquaculture Science, The Nothe, Weymouth DT4 8UB, UK
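
One way to read "the minimum size of the effect at a given probability" is as a one-sided lower confidence bound on the mean effect. The sketch below uses a normal approximation, which is rough for a sample this small (a Student t quantile would be more appropriate but is not in the Python standard library). The data values and the function name `lower_confidence_bound` are hypothetical.

```python
import math
import statistics


def lower_confidence_bound(samples, confidence=0.95):
    """One-sided lower bound on the mean effect, normal approximation."""
    n = len(samples)
    mean = statistics.fmean(samples)
    standard_error = statistics.stdev(samples) / math.sqrt(n)
    # Quantile of the standard normal for the requested one-sided confidence.
    z = statistics.NormalDist().inv_cdf(confidence)
    return mean - z * standard_error


if __name__ == "__main__":
    # Invented per-patient cholesterol reductions, in arbitrary units.
    cholesterol_drops = [1.2, 0.8, 1.5, 0.9, 1.1, 1.3, 0.7, 1.0]
    bound = lower_confidence_bound(cholesterol_drops)
    print(f"With about 95% confidence the mean effect is at least {bound:.2f}")
```

Reporting such a bound answers "how large is the effect, at least?" rather than only "is it distinguishable from zero?", which is closer to the question the comment says should be asked.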
