## Comparing means: an example

Suppose we wish to design a study to compare glomerular filtration rate (GFR) in two groups of ovarian cancer patients, one receiving carboplatin-based chemotherapy and one receiving cisplatin-based chemotherapy. The GFR follows an approximately Normal distribution, so it is appropriate to summarize the GFR by the group means and to compare the groups using a two-sample t-test (see also Section 9.3.3). Assume the mean GFR in the carboplatin group is 100 ml/min one year after the start of chemotherapy. We wish to determine the sample size needed to detect a decrease in GFR of 10 ml/min in the cisplatin group at the same time point. From equation 5.2, we need not only an estimate of the difference in GFR, but also an estimate of the standard deviation (SD) in each group (here assumed to be the same). This can be taken from previous data or, as an approximate rule, one can divide the range of possible values of a Normally distributed variable by four to get an estimate of the SD, as approximately 95 per cent of the values are contained in the range mean ± 2 SD.
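The range rule above can be sketched in a few lines of Python. The helper name `sd_from_range` and the GFR range of 60 to 140 ml/min are hypothetical, chosen only for illustration and not taken from the text or from any data.

```python
def sd_from_range(lowest, highest):
    """Rough SD estimate for a Normally distributed variable.

    About 95 per cent of values lie within mean +/- 2 SD, so the
    plausible range spans roughly 4 SD.
    """
    return (highest - lowest) / 4.0


# Hypothetical plausible range of GFR values in ml/min (illustrative only).
sd_estimate = sd_from_range(60.0, 140.0)
print(sd_estimate)  # 20.0
```

With this illustrative range, a difference of 10 ml/min would correspond to half a SD.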

Alternatively, it is sometimes easier to think not of the difference in means and the SD separately, but of the ratio of the difference in means to the SD. For example, you may wish to detect a difference that is equivalent to half a SD. As the denominator in equation 5.2 is the square of the ratio of the difference in means to the SD, either approach can be taken. In this case, let us assume we are interested in determining whether cisplatin-based therapy decreases GFR at one year by half a SD, with 90 per cent power and a two-sided significance level of 5 per cent. The squared ratio of the difference in means to the SD is then $0.5^2 = 0.25$, and from equation 5.2 and Table 5.2 we see that the number of patients required in each group is $(2 \times 10.507)/0.25 \approx 84$.
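The arithmetic above can be checked with a short Python sketch. It assumes that equation 5.2 has the usual form, twice the squared sum of the two standard Normal deviates divided by the squared ratio of the difference in means to the SD, which is consistent with the tabulated value of 10.507 for 90 per cent power and a two-sided 5 per cent significance level. The function name `n_per_group` is illustrative.

```python
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.90):
    """Patients per group for comparing two means (Normal approximation).

    effect_size is the difference in means divided by the common SD.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    return 2 * (z_alpha + z_beta) ** 2 / effect_size ** 2


n = n_per_group(0.5)
print(round(n, 2))  # 84.06
```

The text quotes this figure as 84; in practice one might round up to the next whole patient.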

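Strictly, when using the two-sample t-test to compare the groups, we need to adjust equation 5.2 by adding an extra term. In a commonly used form of this correction, the number of patients required in each group becomes:

$$
n = \frac{2\,(z_{\alpha/2} + z_{\beta})^2}{(\delta/\sigma)^2} + \frac{z_{\alpha/2}^2}{4}
$$

where $z_{\alpha/2}$ and $z_{\beta}$ are the standard Normal deviates corresponding to the two-sided significance level and the power, $\delta$ is the difference in means and $\sigma$ is the common SD.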

This makes a very small difference (for a 5 per cent significance level the additional term is approximately equal to 1), but one which is proportionately bigger for very small studies than for very large studies. It accounts for the fact that the SD is estimated from the sample, so the test statistic follows a t distribution rather than a Normal distribution, a distinction that matters most when the sample is small.
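To illustrate how the size of the adjustment compares with the unadjusted sample size, the sketch below assumes the extra term takes the commonly used form of the squared two-sided Normal deviate divided by four, which gives roughly 1 at the 5 per cent significance level, as stated above. The function name and the second effect size (2 SD) are illustrative assumptions.

```python
from statistics import NormalDist


def n_per_group_with_adjustment(effect_size, alpha=0.05, power=0.90):
    """Return (unadjusted n, extra term) per group.

    Assumes the t-test adjustment is z_{alpha/2}^2 / 4 (an assumed form,
    consistent with a value of about 1 at 5 per cent significance).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    base = 2 * (z_alpha + z_beta) ** 2 / effect_size ** 2
    extra = z_alpha ** 2 / 4
    return base, extra


# Half a SD (the example above) versus a much larger, illustrative effect.
for effect_size in (0.5, 2.0):
    base, extra = n_per_group_with_adjustment(effect_size)
    share = 100 * extra / base  # extra term as a percentage of unadjusted n
    print(f"effect size {effect_size}: n = {base:.2f} + {extra:.2f} "
          f"({share:.1f} per cent extra)")
```

The extra term is the same in both cases, so it adds proportionately far more to the small study than to the large one.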