“In medical studies investigators are usually interested in determining the size of difference of a measured outcome between groups, rather than a simple indication of whether or not it is statistically significant.”
Gardner MJ, Altman DG. Confidence intervals rather than p values: estimation rather than hypothesis testing. BMJ 1986;292(6522):746-50.


To summarize from last month, p-values tell us whether our results are statistically significant. So why isn’t this enough? Statistical significance and clinical importance are two separate judgments. A large study may detect a statistically significant difference that is small and clinically unimportant; conversely, a small study may fail to detect a small difference that is clinically important. In short, tunnel vision on the p-value is not really helpful and encourages lazy thinking!
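
The point about study size can be illustrated with a little arithmetic. The sketch below is only an illustration under stated assumptions: the numbers are made up, the outcome is assumed to have a known standard deviation of 10 units in both groups, and a simple two-sided z-test is used for the difference between two group means. The function name two_sided_p is hypothetical.

```python
from statistics import NormalDist

def two_sided_p(diff, sd, n_per_group):
    """Two-sided z-test p-value for a difference between two group means.

    Assumes equal group sizes and a known, common standard deviation.
    """
    se = sd * (2 / n_per_group) ** 0.5   # standard error of the difference
    z = diff / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical large study: a tiny difference of 0.5 units, 20,000 patients per group.
p_large = two_sided_p(diff=0.5, sd=10.0, n_per_group=20_000)

# Hypothetical small study: a larger difference of 5 units, 10 patients per group.
p_small = two_sided_p(diff=5.0, sd=10.0, n_per_group=10)

print(f"Large study, difference 0.5: p = {p_large:.2g}")
print(f"Small study, difference 5.0: p = {p_small:.2g}")
```

Under these assumptions the large study's tiny difference falls below the conventional 0.05 threshold, while the small study's tenfold larger difference does not. The p-value alone says nothing about which difference matters to patients.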


A confidence interval (CI) provides a range of plausible values for the size of the true treatment effect, given the size of the difference actually observed. If 95% CIs were constructed from 100 samples of the same size drawn from the same population, we would expect about 95 of them to contain the true treatment effect and about 5 not to. In this sense we are 95% confident that the CI contains the true treatment effect. Since confidence attaches to the interval and not to the true effect, it is incorrect to say that “there is a 95% probability that the true effect is within the CI.” This is because the true effect (which is not known to us and is the reason we are doing the study in the first place) either falls in a particular interval or it does not. That is, its probability of being in that particular CI is either 100% or 0%; it is not 95%. Finally, note that CIs can be constructed for any confidence level, although most investigators report 90%, 95%, or 99% CIs.
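
The repeated-sampling interpretation can be checked with a small simulation. What follows is a minimal sketch, not a recommended analysis: it assumes normally distributed outcomes with a known standard deviation, a hypothetical true treatment effect of 5 units, and a z-based interval for the difference between two group means. All names and numbers are made up for illustration.

```python
import random
from statistics import NormalDist, mean

def simulate_coverage(true_effect=5.0, sd=10.0, n_per_group=30,
                      n_studies=2000, level=0.95, seed=1):
    """Return the fraction of simulated studies whose CI contains the true effect."""
    rng = random.Random(seed)
    # Two-sided critical value (about 1.96 for a 95% interval).
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    # Standard error of the difference between two group means, sd assumed known.
    se = sd * (2 / n_per_group) ** 0.5
    covered = 0
    for _ in range(n_studies):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, sd) for _ in range(n_per_group)]
        diff = mean(treated) - mean(control)   # observed treatment effect
        lower, upper = diff - z * se, diff + z * se
        if lower <= true_effect <= upper:
            covered += 1
    return covered / n_studies

coverage = simulate_coverage()
print(f"Share of 95% CIs containing the true effect: {coverage:.3f}")
```

Across many simulated studies the share of intervals containing the true effect should settle close to 0.95, yet each individual interval either contains the true effect or misses it. Passing level=0.90 or level=0.99 gives narrower or wider intervals with correspondingly lower or higher coverage.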

 

Next month, we will continue our discussion of the confidence interval.