## Differences and relationship between significance testing and estimation

The differences and relationship between significance testing and estimation are best shown using hypothetical examples.

The three rows in Table 9.1 show the results of three hypothetical randomized trials, all estimating the same underlying true difference between treatments. The three trials differ only in size, with 200, 2000 and 20,000 patients randomized respectively. In practice, because of the play of chance, not all of these trials would produce exactly the same observed difference, but we have assumed this for the purposes of this illustration.

The first trial, with 200 patients randomized, has a p-value of 0.46. This trial may be reported as a 'negative trial - showing no difference between the treatments'. This is of course wrong: the trial actually contains insufficient information, and is more correctly reported as an inconclusive trial. This can be seen from the width of the 95 per cent confidence interval, which shows that the difference between treatments could still be large in favour of either treatment: 9 per cent in favour of the control treatment or 17 per cent in favour of the experimental treatment.

The second trial, with 2000 patients randomized, has a p-value of 0.02 and a confidence interval ranging from 1 to 9 per cent. This p-value may be considered conventionally significant, in that it is less than 0.05. This is reflected in the 95 per cent confidence interval, which excludes the value of no difference, i.e. zero. This displays the relationship between the significance level and confidence intervals: a p-value of exactly 0.05 will mean that the 95 per cent confidence interval just includes the value zero. Similarly, a p-value of 0.01 will mean that the 99 per cent confidence interval will just include the value zero. In general, a p-value of p will mean that the 100(1 - p) per cent confidence interval will just include zero. This 2000 patient trial provides us with more persuasive evidence in favour of the experimental treatment; the p-value is encouraging.
However, the 95 per cent confidence interval is still wide, ranging from 1 to 9 per cent in favour of the experimental treatment. Thus this 2000 patient trial, although encouraging, still has some uncertainty about the size of the effect.

The 20,000 patient trial has a p-value of less than 0.0001, and a 95 per cent confidence interval of (4%, 6%). This trial therefore provides extremely good evidence that there is a difference between the treatments, and the 95 per cent confidence interval is sufficiently narrow to suggest that the difference has been estimated with reasonable accuracy. It is probably only this trial that allows a definitive statement to be made, in favour of the experimental treatment.

| Number of patients | Estimated difference | 95% confidence interval | p-value |
|--------------------|----------------------|-------------------------|---------|
| 200                | 5%                   | (-9%, 17%)              | 0.46    |
| 2000               | 5%                   | (1%, 9%)                | 0.02    |
| 20,000             | 5%                   | (4%, 6%)                | <0.0001 |
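The link between p-values and confidence intervals described above can be checked numerically. The following is a minimal sketch, assuming a normal approximation for the estimated difference; the standard errors used here (`SE_AT_200` and the value 2.0 in the final check) are hypothetical values chosen for illustration and are not taken from the trials in the table, so the printed figures will only roughly resemble it. The function names are likewise illustrative.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution: mean 0, sd 1


def two_sided_p(estimate, se):
    """Two-sided p-value for H0: true difference = 0 (normal approximation)."""
    z = abs(estimate) / se
    return 2 * (1 - STD_NORMAL.cdf(z))


def confidence_interval(estimate, se, level=0.95):
    """Confidence interval at the given level (e.g. 0.95), normal approximation."""
    z_crit = STD_NORMAL.inv_cdf(1 - (1 - level) / 2)
    return estimate - z_crit * se, estimate + z_crit * se


ESTIMATE = 5.0    # observed difference in percentage points (as in the text)
SE_AT_200 = 6.6   # ASSUMED standard error for the 200-patient trial

# Same observed difference, increasing trial size: the interval narrows
# and the p-value falls, because the standard error shrinks as 1/sqrt(n).
for n in (200, 2000, 20000):
    se = SE_AT_200 * sqrt(200 / n)
    p = two_sided_p(ESTIMATE, se)
    low, high = confidence_interval(ESTIMATE, se, 0.95)
    print(f"n={n:>6}: 95% CI ({low:5.1f}, {high:5.1f}), p={p:.4f}")

# Duality check: the 100(1 - p) per cent interval has one limit at zero.
se = 2.0  # ASSUMED, for illustration only
p = two_sided_p(ESTIMATE, se)
low, high = confidence_interval(ESTIMATE, se, level=1 - p)
print(f"p={p:.4f}; {100 * (1 - p):.2f}% CI lower limit = {low:.6f}")
```

Under these assumptions the final check should give a lower limit of zero up to rounding error, which is the general rule stated in the text: a p-value of p corresponds to a 100(1 - p) per cent confidence interval that just includes zero.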
