Introduction

Chapter 3 introduced the idea that the treatment effect observed in a clinical trial may vary from the true effect because of the two major sources of error - systematic and random error - and emphasized that the random error component can be minimized by randomizing large numbers of patients. The aim of this chapter is to discuss theoretical and practical issues affecting the determination of the ideal sample size for a trial and to describe when and how compromises from the ideal can be made. While the focus is on randomized trials, we also expand on sample size calculation for the phase II trial designs introduced in Chapter 4. It is beyond the scope of this chapter to detail sample size calculations for all the types of data and circumstances that a researcher may meet. Our aim therefore is to describe ways to determine the appropriate 'input factors' - in particular the size of difference a trial is designed to target - with only brief reference to sample size formulae. We refer readers to published books and software for more detailed options.

Although there is some truth in the statement that any randomized evidence is better than none - in other words that doing a smaller than ideal trial is better than treating patients haphazardly - it is important to be aware of the consequences. A common misconception is to assume that sample size only determines precision, i.e. the width of the confidence interval, and not accuracy or 'closeness' to the true treatment effect. An estimate of treatment effect from a small trial is not simply an estimate close to the 'true' treatment effect whose sole disadvantage is a wide confidence interval. There will certainly be more uncertainty compared with a larger trial, but in addition there is an increased risk that the estimate will be inaccurate, purely through the play of chance. Note that this does not mean the estimate is biased - repeating many similarly small but properly randomized trials would produce results which, on average, estimate the true underlying effect.

The principle is perhaps best illustrated by considering a sequence of coin tosses. Suppose you were ignorant of the properties of a coin, and wished to estimate the probability that the coin, when tossed, will fall as heads. Clearly this can be estimated by the proportion of tosses that result in heads. Suppose the first toss falls as tails - at this stage your estimate of the probability of heads is 0. The next toss will update your estimate to be 0 still if it falls as tails, or 0.5 if it falls as heads. After three tosses your estimate may be 0, 0.33 or 0.67. As the number of tosses accumulates, your estimate will tend to move closer to the true value, which for a fair coin is 0.5.
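The coin-tossing argument can be tried out with a short simulation. The sketch below is illustrative only: the function name `running_estimates`, the fixed seed and the assumption of a fair coin (probability of heads 0.5) are choices made here, not part of the original discussion. It re-estimates the probability of heads after every toss and prints the estimate at a few points along the sequence.

```python
import random


def running_estimates(n_tosses, p_heads=0.5, seed=0):
    """Return the estimated probability of heads after each toss.

    p_heads is the assumed true probability (0.5 for a fair coin) and
    seed fixes the random sequence so the run is reproducible.
    """
    rng = random.Random(seed)
    heads = 0
    estimates = []
    for toss in range(1, n_tosses + 1):
        if rng.random() < p_heads:  # this toss falls as heads
            heads += 1
        estimates.append(heads / toss)  # proportion of heads so far
    return estimates


estimates = running_estimates(10_000)
for n in (1, 3, 10, 100, 1_000, 10_000):
    print(f"after {n:>6} tosses: estimate = {estimates[n - 1]:.3f}")
```

After one toss the estimate can only be 0 or 1. Changing the seed gives a different early path, but the later estimates should settle near 0.5 in each case.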

The first estimate is unbiased but, as we know, very far from the truth. An experiment such as this with two possible outcomes is known as a binomial experiment, and we know from experience how chance can influence such an experiment. We also know that as we repeat the tosses of a coin, re-estimating the probability of heads after each one, the estimate will gradually come closer to the truth. In clinical practice of course we are in a position where we never know 'the truth', but an unbiased trial will tend, as the number of patients increases, to give an estimate which comes closer and closer to the truth. This point emphasizes why confidence intervals (CI), which are always important in interpreting the results of a clinical trial, are particularly important for small trials - they define a range with quantified properties. One can say (roughly) that there is a 95 per cent chance that the true effect lies in the range defined by a 95 per cent CI. The interpretation of a single estimate is much more difficult.
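To make the link between the number of observations and the width of a confidence interval concrete, the sketch below computes an approximate 95 per cent CI for a proportion. It uses the simple normal-approximation (Wald) interval, which is a choice made here for brevity and is known to behave poorly for very small samples or proportions near 0 or 1. The function name `wald_ci` and the example counts are hypothetical.

```python
import math


def wald_ci(successes, n, z=1.96):
    """Approximate CI for a proportion using the normal approximation.

    z = 1.96 corresponds to a 95 per cent interval. The limits are
    clipped to the range 0 to 1, since a proportion cannot lie outside it.
    """
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of p_hat
    return max(0.0, p_hat - z * se), min(1.0, p_hat + z * se)


# The same observed proportion of heads (0.5) at increasing numbers of tosses.
for n in (10, 100, 1_000, 10_000):
    low, high = wald_ci(n // 2, n)
    print(f"n = {n:>6}: estimate 0.500, 95% CI {low:.3f} to {high:.3f}")
```

With 5 heads in 10 tosses the interval runs from about 0.19 to 0.81, whereas 50 heads in 100 tosses gives about 0.40 to 0.60. The point estimate is identical in both cases, but the range of values compatible with the data differs greatly.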
