Figure 2.2

Comparison of (a) normal and (b) asymmetric (skewed) distribution of data

Because the true standard deviation of the population (σ) can be determined only if a very large number of measurements are made, the standard deviation of an analytical method is usually estimated from a much smaller set of measurements that constitutes a sample of all the measurements that could be made. In this case the standard deviation of the sample (s) is given by:

$$s = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}} \qquad (2.6)$$

If the number of measurements is small, the standard deviation will be underestimated if eq. 2.5 is used instead of eq. 2.6.
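The size of this underestimate can be illustrated with a short sketch. It assumes that eq. 2.5 divides the sum of squared deviations by n and that eq. 2.6 divides by n - 1; the function names and the replicate values are hypothetical.

```python
import math

def sd_divide_by_n(values):
    """Standard deviation with n in the denominator (assumed form of eq. 2.5)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / n)

def sd_divide_by_n_minus_1(values):
    """Standard deviation with n - 1 in the denominator (assumed form of eq. 2.6)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

# Hypothetical replicate results, for illustration only
replicates = [99.2, 100.4, 100.9, 99.7, 100.3]

sd_n = sd_divide_by_n(replicates)
sd_n1 = sd_divide_by_n_minus_1(replicates)
ratio = sd_n / sd_n1  # always sqrt((n - 1) / n), whatever the data

print(f"divide by n:     {sd_n:.3f}")
print(f"divide by n - 1: {sd_n1:.3f}")
print(f"ratio:           {ratio:.3f}")
```

With five measurements the n-divisor estimate is about 11% smaller than the n - 1 estimate; the gap shrinks as n grows.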

Figure 2.3

Relationship between normal distribution of data and standard deviation

### 2.2 Validation parameters

The primary statistical parameters that validate an analytical method are the accuracy and precision. Although the validity of experimental data may be defined primarily by the accuracy and precision of the analytical method used to generate those data, the use of these parameters alone is generally considered inadequate and supplementary experiments are necessary for the validation of a new method to be complete.

The following sections summarize the various parameters that have been described for the validation of quantitative analytical methods. The procedures used for the validation of qualitative methods are generally less involved and are usually concerned mainly with the establishment of selectivity or specificity and ruggedness.

### 2.2.1 Accuracy and precision

The precision and accuracy of analytical methods are described quantitatively by relative errors. One example of a relative error is the accuracy (eq. 2.2), which describes the deviation from the expected result. The relative error term usually used to describe precision is the relative standard deviation (RSD)¹:

$$\mathrm{RSD} = \frac{s}{\bar{x}} \times 100\% \qquad (2.7)$$
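As a minimal sketch, the RSD can be computed as the sample standard deviation divided by the mean and expressed as a percentage. The function name and the peak areas below are hypothetical.

```python
import statistics

def relative_standard_deviation(values):
    """Return the RSD (coefficient of variation) as a percentage."""
    # statistics.stdev uses the n - 1 denominator (sample standard deviation)
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

# Hypothetical peak areas from six repeat injections of the same solution
peak_areas = [1502.0, 1498.0, 1510.0, 1495.0, 1505.0, 1500.0]

rsd_percent = relative_standard_deviation(peak_areas)
print(f"RSD = {rsd_percent:.2f}%")
```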

¹ The relative standard deviation is also known as the coefficient of variation (CV).

The precision of the analytical system and the precision of the method are generally defined separately. This is because the former provides information on the errors associated with the instrumentation and the latter provides information on the complete method. The difference between the two ordinarily arises from errors associated with sample preparation. For example, in liquid chromatography the system precision may be determined from the RSD of repetitive injections (n=5 or 6) of the same solution. In contrast to the system precision, the method precision is determined by repetitive analysis (n=5 or 6) of a single homogeneous sample. For example, the system RSD for an LC method may be assumed to be a function of the random errors arising from the column, the injector, the detector and the integration device. Reasonable estimates of the RSDs attributable to these components might be:

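RSDcolumn = 0.1%

RSDinjector = 0.5%

RSDdetector = 0.3%

RSDintegrator = 0.1%

in which case the system RSD is given by:

$$\mathrm{RSD}_{\mathrm{system}} = \sqrt{0.1^2 + 0.5^2 + 0.3^2 + 0.1^2} \qquad (2.8)$$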

The error attributable to sample preparation will vary considerably depending on the number of steps, the complexity of each step and the concentration of the analyte of interest. For a simple LC method for the determination of the purity of a drug substance, the sample preparation might be relatively straightforward, involving weighing the drug, dissolving it in a suitable solvent and adjusting to volume. In this case, the RSD for the sample preparation step (RSDprep) might be approximately 1%. The method RSD is then given by:

$$\mathrm{RSD}_{\mathrm{method}} = \sqrt{0.1^2 + 0.5^2 + 0.3^2 + 0.1^2 + 1.0^2} \qquad (2.9)$$

Equations 2.8 and 2.9 illustrate a very important point: the overall random error associated with a particular determination or method is dominated by the least precise step or component, so measures designed to improve the precision should always be directed towards the step or component with the largest error. Thus, for the hypothetical example shown here, the system precision in LC is governed by the precision of the injection device and the method precision is governed by the complexity of the sample preparation procedure.
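The arithmetic behind this point can be checked with a short sketch that combines the component RSDs as a root sum of squares. The column value of 0.1% is taken from the first term of eq. 2.9; the function name and the two "what if" scenarios are hypothetical.

```python
import math

def combined_rsd(components):
    """Combine independent RSDs (in %) as the root sum of squares."""
    return math.sqrt(sum(rsd ** 2 for rsd in components.values()))

# Component RSDs (%) from the hypothetical LC example in the text
system = {"column": 0.1, "injector": 0.5, "detector": 0.3, "integrator": 0.1}
method = {**system, "sample_prep": 1.0}

system_rsd = combined_rsd(system)  # eq. 2.8
method_rsd = combined_rsd(method)  # eq. 2.9

# Hypothetical scenarios: halve the largest error, or remove a small one entirely
halved_prep = combined_rsd({**method, "sample_prep": 0.5})
no_detector_error = combined_rsd({**method, "detector": 0.0})

print(f"system RSD:               {system_rsd:.2f}%")
print(f"method RSD:               {method_rsd:.2f}%")
print(f"sample prep error halved: {halved_prep:.2f}%")
print(f"detector error removed:   {no_detector_error:.2f}%")
```

Halving the sample preparation error improves the method RSD far more than eliminating the detector error altogether, which is the behaviour the paragraph above describes.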

A distinction may be made between the within-run precision (also referred to as the within-day or intra-laboratory precision) and the between-day (between-run or inter-laboratory) precision of an analytical method. These two terms are sometimes referred to as the repeatability and the reproducibility of the method, respectively.

Figure 2.4

Within-run and between-run precision for the assay of (R)- (circles) and (S)-N-demethyl-dimethindene (squares) in urine by capillary electrophoresis. Data taken from ref. [8]

Generally the value of the within-run RSD of a method is less than the value of the between-run RSD. For example, Fig. 2.4 shows the relationship between the between-run and within-run RSDs for the analysis of N-demethyl-dimethindene in urine by capillary electrophoresis [8]. Figure 2.4 also shows an important feature of chromatographic and related methods of analysis: the precision generally decreases with decreasing analyte concentration, reaching unacceptable levels as the measured signal approaches the noise inherent in the system. However, the precision of a method does not always improve with increasing concentration. For example, the highest precision of receptor binding assays is generally obtained at intermediate concentrations and decreases at higher and at lower concentrations (e.g. Fig. 2.5).

The fact that the between-run precision of an assay is generally not as good as the within-run precision, at a given concentration, arises from the increased number of steps needed to run an assay on consecutive days compared with the number of steps needed for a single day (e.g. daily preparation of standards, reagents etc.).

Figure 2.5

Within-run precision of the assay of thyroid-stimulating hormone by electrochemical enzyme immunoassay. Data taken from ref. [9]

Some analysts have chosen to use the RSD of the slope of the daily calibration curve (see Sec. 2.2.6) as a measure of the between-day precision of an analytical method. This is inappropriate because the slope of the calibration curve is the parameter used to correct for the day-to-day variation in response factor and its RSD only provides an indirect indication of the reproducibility of the method. On the other hand, the day-to-day variability of the response factor can be taken as a measure of the ruggedness of the method (see Sec. 2.2.7).

Figure 2.6

Results of six replicate assays conducted by four different analysts on urine spiked with a drug at a concentration of 25 ng/mL. This figure illustrates the four possible results: (1) accurate but imprecise, (2) inaccurate and imprecise, (3) accurate and precise, (4) inaccurate but precise

In addition to measuring the within-day and between-day variability, the concepts of accuracy and precision may be used to define the ruggedness or robustness of a method (see also Sec. 2.2.7). For example, Fig. 2.6 shows the hypothetical results of an experiment designed to determine the effects of different analysts on the determination of a drug spiked into urine at a known (true) concentration of 25 ng/mL. Each analyst demonstrates one of the four possible outcomes: analyst 1, accurate but imprecise; analyst 2, inaccurate and imprecise; analyst 3, accurate and precise; and analyst 4, inaccurate but precise.
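The four outcomes can be sketched numerically. The replicate values below are invented for illustration and are not the data of Fig. 2.6; the 5% acceptance limits are hypothetical, and accuracy is assumed to be expressed as the relative error of the mean with respect to the spiked concentration.

```python
import statistics

TRUE_CONC = 25.0     # ng/mL, the spiked concentration stated in the text
MAX_ABS_ERROR = 5.0  # % relative error; hypothetical acceptance limit
MAX_RSD = 5.0        # % RSD; hypothetical acceptance limit

# Invented replicate results (ng/mL), six assays per analyst
results = {
    1: [21.0, 29.5, 23.0, 27.5, 22.0, 27.0],
    2: [27.0, 35.0, 29.5, 33.5, 28.0, 33.0],
    3: [24.8, 25.3, 25.0, 24.7, 25.2, 25.0],
    4: [30.8, 31.3, 31.0, 30.7, 31.2, 31.0],
}

def classify(values, true_value=TRUE_CONC):
    """Return (accurate, precise) flags for one analyst's replicates."""
    mean = statistics.mean(values)
    relative_error = 100.0 * (mean - true_value) / true_value
    rsd = 100.0 * statistics.stdev(values) / mean
    return abs(relative_error) <= MAX_ABS_ERROR, rsd <= MAX_RSD

outcomes = {analyst: classify(values) for analyst, values in results.items()}
for analyst, (accurate, precise) in outcomes.items():
    print(f"analyst {analyst}: "
          f"{'accurate' if accurate else 'inaccurate'}, "
          f"{'precise' if precise else 'imprecise'}")
```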

It is important to note that increasing the number of measurements on the sample does not necessarily decrease the value of the measured standard deviation. However, as the sample size decreases, the uncertainty introduced by using s to estimate the true (population) standard deviation of the method, σ, increases. To allow for this, the confidence limits for the sample mean are given by:

$$\mu = \bar{x} \pm \frac{t\,s}{\sqrt{n}} \qquad (2.10)$$

where the term s/√n is defined as the standard error of the mean and the values of t may be obtained from statistical tables.
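A minimal sketch of this calculation follows. The small lookup table holds standard two-tailed 95% values of t for a few degrees of freedom; the replicate data and the function name are hypothetical.

```python
import math
import statistics

# Two-tailed 95% critical values of Student's t, keyed by degrees of freedom (n - 1)
T_95 = {2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 9: 2.262}

def confidence_limits(values, t_table=T_95):
    """Return (lower, upper) 95% confidence limits for the mean."""
    n = len(values)
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    half_width = t_table[n - 1] * standard_error
    return mean - half_width, mean + half_width

# Hypothetical replicate results (ng/mL)
replicates = [24.8, 25.3, 25.0, 24.7, 25.2, 25.0]

lower, upper = confidence_limits(replicates)
print(f"95% confidence limits: {lower:.2f} to {upper:.2f} ng/mL")
```

Because t grows as the degrees of freedom fall, the interval widens quickly for small n even when s itself is unchanged.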

### 2.2.2 Calibration

An important step in the validation of any analytical method is the establishment of the mathematical relationship between the measured response (yi) and the concentration of the analyte (Ci=xi). Once the mathematical relationship has been established the analytical instrument or method may be calibrated. The calibration procedure will depend upon the type of method, whether the method is instrumental or non-instrumental, the type of sample, the degree of accuracy and precision required and the concentration range of the analyte or analytes of interest. For the purposes of this discussion, the types of response-concentration functions commonly experienced in pharmaceutical and biomedical analysis have been conveniently divided into those that are linear and those that are non-linear.

### 2.2.2.1 Linear response functions

The most convenient response function (Fig. 2.7) is one in which the measured quantity (yi) is linearly related to the concentration (Ci=xi) according to eq. 2.11:

$$y_i = a + b x_i \qquad (2.11)$$

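The slope (b) and intercept (a) coefficients are given by eqs. 2.12 and 2.13, respectively:

$$b = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2} \qquad (2.12)$$

$$a = \bar{y} - b\bar{x} \qquad (2.13)$$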

where x̄ and ȳ are the means of the concentrations (xi) and the measured responses (yi), respectively. Figure 2.7 shows the results of a hypothetical experiment to establish the relationship between the peak height obtained by a technique such as liquid chromatography and the concentration of analyte injected. This figure also shows that the line obtained by unweighted least-squares linear regression of the measured response on the analyte concentration must pass through the centroid (x̄, ȳ) and the intercept (a). Most hand-held calculators and simple computer programs, such as CricketGraph® or DeltaGraph®, readily allow the calculation of the slope and intercept as well as the correlation coefficient (r), which is a measure of the goodness of fit of the data (eq. 2.14). A value of +1 for r indicates perfect correlation with a positive slope (b), and a value of -1 indicates perfect correlation with a negative slope. A value of 0 for r indicates no linear correlation between x and y.

$$r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}} \qquad (2.14)$$
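The slope, intercept and correlation coefficient can be computed directly from the sums of squares and cross-products. The sketch below uses the standard unweighted least-squares expressions; the calibration data and names are hypothetical.

```python
import math

def linear_fit(x, y):
    """Unweighted least-squares fit of y = a + b*x; returns (a, b, r)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx              # slope
    a = y_mean - b * x_mean    # intercept
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r

# Hypothetical calibration data: concentration (ng/mL) and peak height
conc = [1.0, 2.0, 4.0, 6.0, 8.0, 10.0]
height = [2.1, 3.9, 8.2, 11.8, 16.1, 19.9]

a, b, r = linear_fit(conc, height)
x_mean = sum(conc) / len(conc)
y_mean = sum(height) / len(height)

print(f"intercept a = {a:.3f}, slope b = {b:.3f}, r = {r:.4f}")
# The fitted line passes through the centroid of the data
print(f"fitted response at mean concentration: {a + b * x_mean:.3f}")
print(f"mean response:                         {y_mean:.3f}")
```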

If necessary, the statistical significance of the correlation coefficient may be determined using a two-tailed t-test.
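One common form of this test, assumed here, converts r to a t statistic with n - 2 degrees of freedom and compares it with the tabulated two-tailed value. The values of r and n below are hypothetical.

```python
import math

def t_for_correlation(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return abs(r) * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

# Hypothetical example: r = 0.98 from n = 8 calibration points
t_value = t_for_correlation(r=0.98, n=8)
T_CRITICAL = 2.447  # tabulated two-tailed 95% value for 6 degrees of freedom

print(f"t = {t_value:.2f}")
print("significant at the 95% level" if t_value > T_CRITICAL else "not significant")
```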

Figure 2.7

Regression analysis of representative calibration data showing 95% and 99% confidence intervals (upper) and analysis of residuals (lower)

The value of r obtained for the least-squares linear regression analysis of the sample data in Fig. 2.7 is 0.994. This corresponds to a t-value of 29.05, which is highly significant (P<0.01). Although the correlation coefficient, r, is readily obtained, it should not be relied on to establish the linearity of the calibration data per se, because calibration data that have a high degree of curvature can still give high values of r (>0.99) and statistically significant values of t. Instead, other approaches should be used to establish the linearity of the response function. The most useful approach is to analyze the residuals (eq. 2.15), that is, the differences between the observed (yi) and the predicted (ŷi) values of the measured response, which should be randomly distributed around zero when plotted against xi (Fig. 2.7). Any curvature in the data is accentuated by analysis of the residuals.
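The following sketch shows why r alone is a poor test of linearity. It fits a straight line to deliberately curved synthetic data (a hypothetical response that flattens at high concentration) and inspects the signs of the residuals.

```python
import math

def linear_fit(x, y):
    """Unweighted least-squares fit of y = a + b*x; returns (a, b, r)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_mean - b * x_mean
    return a, b, sxy / math.sqrt(sxx * syy)

# Synthetic, noise-free curved response: y = 10x - 0.2x^2 (hypothetical)
conc = [float(c) for c in range(1, 11)]
response = [10.0 * c - 0.2 * c ** 2 for c in conc]

a, b, r = linear_fit(conc, response)
residuals = [yi - (a + b * xi) for xi, yi in zip(conc, response)]
signs = "".join("+" if e > 0 else "-" for e in residuals)

print(f"r = {r:.4f}")
print("residual signs by increasing concentration:", signs)
```

Although r exceeds 0.99, the residuals are negative at both ends of the range and positive in the middle, a systematic pattern that would not arise from random scatter about a truly linear response.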

If the analysis of the residuals (eq. 2.15) suggests curvature in the relationship between response and concentration, a non-linear approach should be considered. In addition to the calculation of residuals, the correlation coefficient (eq. 2.14) and its t-value, a more complete analysis of the data involves the calculation of the standard errors of the slope and intercept, which may be obtained from the following equations:

$$s_b = \frac{s_{y/x}}{\sqrt{\sum_i (x_i - \bar{x})^2}}$$

$$s_a = s_{y/x}\sqrt{\frac{\sum_i x_i^2}{n\sum_i (x_i - \bar{x})^2}}$$

where:

$$s_{y/x} = \sqrt{\frac{\sum_i (y_i - \hat{y}_i)^2}{n - 2}}$$

and sb is the standard error of the slope (b), sa is the standard error of the intercept (a), ŷi is the value calculated from the fitted line and yi - ŷi are the residuals. For a perfect linear correlation between x and y, each of the residuals will be 0. The confidence intervals for the intercept and the slope may also be calculated (eqs. 2.19 and 2.20, respectively) and their statistical significance determined using a t-test:

$$a \pm t\,s_a \qquad (2.19)$$

$$b \pm t\,s_b \qquad (2.20)$$
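These quantities can be sketched as follows, using the standard textbook expressions for the residual standard deviation and the standard errors of the slope and intercept (assumed here to match the equations referred to above). The calibration data are hypothetical, and the t value is the tabulated two-tailed 95% value for n - 2 = 5 degrees of freedom.

```python
import math

T_95_DF5 = 2.571  # two-tailed 95% value of t for 5 degrees of freedom

# Hypothetical calibration data: concentration (ng/mL) and response
x = [1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
y = [2.3, 4.1, 8.4, 12.1, 15.8, 20.3, 23.9]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n
sxx = sum((xi - x_mean) ** 2 for xi in x)
b = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
a = y_mean - b * x_mean

# Residuals and their standard deviation (n - 2 degrees of freedom)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
s_yx = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Standard errors of the slope and the intercept
s_b = s_yx / math.sqrt(sxx)
s_a = s_yx * math.sqrt(sum(xi ** 2 for xi in x) / (n * sxx))

# 95% confidence intervals
ci_b = (b - T_95_DF5 * s_b, b + T_95_DF5 * s_b)
ci_a = (a - T_95_DF5 * s_a, a + T_95_DF5 * s_a)
intercept_differs_from_zero = not (ci_a[0] <= 0.0 <= ci_a[1])

print(f"slope     b = {b:.3f} (95% CI {ci_b[0]:.3f} to {ci_b[1]:.3f})")
print(f"intercept a = {a:.3f} (95% CI {ci_a[0]:.3f} to {ci_a[1]:.3f})")
print("intercept differs significantly from zero:", intercept_differs_from_zero)
```

If the confidence interval for the intercept includes zero, the intercept is not significantly different from zero at the chosen confidence level.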

An alternative approach to determining the significance of the y-intercept is to establish whether or not the y-intercept is included in the confidence interval of the calculated values of the response, ŷi, at xi=0 (see Sec. 2.2.2.1.a).

Once the linearity of the response function has been established and the statistical significance of the intercept determined, the decision is made as to whether applications of the method should be based on multiple-point or two-point calibrations. Alternatively, the initial analysis of the response function by least squares unweighted linear regression may indicate that a non-linear response function is more appropriate (see Sec. 2.2.2.2).

a. Multiple-point calibrations

Multiple-point calibration curves are prepared using standard solutions of the analyte in the relevant matrix, encompassing the expected concentrations of the analyte in the test sample or samples. The actual range of standard solution concentrations varies with the application and is determined by the range of expected values for the test samples. For the analysis of data by unweighted least-squares regression, the variances of the yi values are assumed to be approximately equal and the standard concentrations should be evenly spaced (e.g. Fig. 2.7).

The confidence interval of a measured analytical response (ŷi) for a given standard of known concentration, xi, is given by:

where:

Analysis of calibration data using eqs. 2.21 and 2.22 allows the confidence interval for the measured yi values in the calibration curve to be determined. However, this does not provide information about the confidence intervals for the concentration of a test sample (xCalc) determined by comparison with a calibration curve. In this case the confidence interval of xCalc is obtained from:

where sxCalc is given by the approximation: