Sample Size Determination for Confidence Intervals

Chapter: Biostatistics for the Health Sciences: Estimating Population Means

When conducting an experiment or a clinical trial, cost is an important practical consideration. Often, the number of tests in an engineering experiment or the number of patients enrolled in a clinical trial has a major impact on the cost of the experiment or trial. We have seen that the variance of the sample mean decreases by a factor of 1/n with an increase in the sample size from 1 to n. This statement implies that in order to obtain precise confidence intervals for the population mean, the larger the sample the better.

But, because of cost constraints, we may need to trade off the precision of our estimate against the cost of the test. Also, with clinical trials, the number of patients who are enrolled can have a major impact on the time it will take to complete the trial. Two of the main factors that are impacted by sample size are precision and cost; thus, sample size also affects the feasibility of a clinical trial.

The real question we must ask is: “How precise an estimate do I need in order to have useful results?” We will show you how to address this question in order to determine a minimum acceptable value for n. Once this minimum n is determined, we can see what this n implies about the feasibility of the experiment or trial. In many epidemiological and other health-related studies, sample size estimation is also of crucial importance. For example, epidemiologists need to know the minimum sample size required in order to detect differences in occurrences of diseases, health conditions, and other characteristics by subpopulations (e.g., smokers versus nonsmokers), or in the effects of different exposures or interventions.

In Chapter 9, we will revisit this issue from the perspective of hypothesis testing. The issues in hypothesis testing are the same and the methods of evaluation are very similar to those for sample size estimation based on confidence interval width that we will now describe.

Let us first consider the simplest case of estimating a population mean when the variance σ² is known. In Section 8.4, we saw that a 95% confidence interval is given by [X̄ – 1.96σ/√n, X̄ + 1.96σ/√n], where X̄ is the sample mean. If we subtract the lower endpoint of the interval from the upper endpoint, we see that the width of the interval is (X̄ + 1.96σ/√n) – (X̄ – 1.96σ/√n) = 2(1.96σ/√n) or 3.92σ/√n.

The way we determine sample size is to put a constraint on the width 3.92σ/√n or the half-width 1.96σ/√n. The half-width represents the greatest distance a point in the interval can be away from the point estimate, so it is a meaningful quantity to constrain. When the main objective is an accurate confidence interval for the parameter, the half-width of the interval is a very natural choice. Other objectives, such as the power of a statistical test, can also be used. We specify a maximum value d for this half-width. The quantity d depends very much on what would be a meaningful interval in the particular trial or experiment. Requiring the half-width to be no larger than d leads to the inequality 1.96σ/√n ≤ d. Using algebra, we see that √n ≥ 1.96σ/d or n ≥ 3.8416σ²/d². To meet this requirement with the smallest possible integer n, we calculate the quantity 3.8416σ²/d² and let n be the next integer larger than this quantity. Display 8.7 summarizes the sample size formula using the half-width d of a confidence interval.

Display 8.7. Sample Size Formula Using the Half-Width d of a Confidence Interval

Take n as the next integer larger than (C)²σ²/d², where C is the standard normal critical value for the chosen confidence level; e.g., for the 95% confidence interval for the mean, take n as the next integer larger than (1.96)²σ²/d².

Let us consider the case where we are sampling from a normal distribution with a known standard deviation of 5, and let us assume that we want the half-width of the 95% confidence interval to be no greater than 0.5. Then d = 0.5 and σ = 5 in this case. Now the quantity 3.8416 σ²/d² is 3.8416(5/0.5)² = 3.8416 (10)² = 3.8416(100) = 384.16. So the smallest integer n that satisfies the required inequality is 385.
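The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text: the function name `sample_size_mean` and its parameter names are hypothetical, and the critical value defaults to the rounded 1.96 used in this section.

```python
import math

def sample_size_mean(sigma, d, z=1.96):
    """Smallest integer n such that z * sigma / sqrt(n) <= d.

    sigma : known population standard deviation
    d     : maximum allowed half-width of the confidence interval
    z     : normal critical value (1.96 for 95% confidence, as in the text)
    """
    if sigma <= 0 or d <= 0:
        raise ValueError("sigma and d must be positive")
    # n >= (z * sigma / d)^2, rounded up to an integer
    return math.ceil((z * sigma / d) ** 2)

# Worked example from the text: sigma = 5, d = 0.5
print(sample_size_mean(sigma=5, d=0.5))  # 385
```

With n = 385 the half-width 1.96(5)/√385 is just under 0.5, whereas n = 384 would leave it slightly above 0.5.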

In order to solve the foregoing problem we needed to know σ, which in most practical situations will be unknown. Our alternatives are to find or guess at an upper bound for σ, to estimate σ from a small pilot study, or to refer to the literature for studies that may publish estimates of σ.
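Because the required n is proportional to σ², the value assumed for σ matters a great deal. The short sketch below shows how the 95% sample size for d = 0.5 changes across a few candidate values of σ. The candidate values 4, 5, and 6 are purely illustrative assumptions, not figures from the text, and the helper name `n_for_sigma` is hypothetical.

```python
import math

def n_for_sigma(sigma, d=0.5, z=1.96):
    """Smallest integer n with half-width z * sigma / sqrt(n) <= d."""
    return math.ceil((z * sigma / d) ** 2)

# Illustrative guesses for sigma (assumed values, not from the text)
for sigma in (4, 5, 6):
    print(sigma, n_for_sigma(sigma))
# 4 246
# 5 385
# 6 554
```

Using an upper bound for σ therefore gives a conservative (larger) n, at the price of a more expensive study.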

Estimating the sample size for the difference between two means is a problem similar to estimating the sample size for a single mean but requires knowing two variances and specifying a relationship between the two sample sizes n_t and n_c.

Recall from Section 8.6 that the 95% confidence interval for the difference between two means of samples selected from two independent normal distributions with known and equal variances is given by (X̄_t – X̄_c) ± 1.96σ√[(1/n_t) + (1/n_c)]. The half-width of this interval is 1.96σ√[(1/n_t) + (1/n_c)]. Assume n_t = kn_c for some proportionality constant k ≥ 1. The proportionality constant k adjusts for the differences in sample sizes used in the treatment and control groups, as explained in the next paragraph. Let d be the constraint on the half-width. The inequality becomes 1.96σ√[{1/(kn_c)} + {1/(n_c)}] = 1.96σ√[(k + 1)/(kn_c)] ≤ d or kn_c/(k + 1) ≥ 3.8416σ²/d² or n_c ≥ 3.8416(k + 1)σ²/(kd²). If n_c = 3.8416(k + 1)σ²/(kd²), then n_t = kn_c = 3.8416(k + 1)σ²/d². In Display 8.8 we present the sample size formula using the half-width d of a confidence interval for the difference between two population means.

Note that if k = 1, then n_c = n_t = 3.8416(2σ²/d²). Taking k greater than 1 increases n_t while it lowers n_c, but the total sample size becomes n_t + n_c = (k + 1)²(3.8416σ²)/(kd²).

Display 8.8. Sample Size Formula Using the Half-Width d of a Confidence Interval (Difference Between Two Population Means When the Sample Sizes Are n and kn, where k > 1)

Take n as the next integer larger than (C)²(k + 1)σ²/(kd²); e.g., for the 95% confidence interval for the difference between two means, take n as the next integer larger than (1.96)²(k + 1)σ²/(kd²).

For k > 1, the total sample size (k + 1)²(3.8416σ²)/(kd²) is larger than 4(3.8416σ²/d²), the total for k = 1 [since (1 + 1)²/1 = 4]. Indeed, (k + 1)²/k = k + 2 + 1/k, which exceeds 4 whenever k ≠ 1, so k = 1 minimizes the total sample size. However, in clinical trials there may be ethical reasons for wanting n_t to be larger than n_c.
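The effect of the allocation ratio on the total sample size can be checked numerically. The sketch below evaluates the factor (k + 1)²/k, i.e., the total n_t + n_c expressed in units of 3.8416σ²/d² before rounding to integers. The function name `total_size_factor` and the chosen values of k are illustrative assumptions.

```python
def total_size_factor(k):
    """Total sample size n_t + n_c in units of 3.8416 * sigma^2 / d^2,
    before rounding, when n_t = k * n_c."""
    if k <= 0:
        raise ValueError("k must be positive")
    return (k + 1) ** 2 / k

for k in (1, 2, 3, 4):
    print(k, round(total_size_factor(k), 3))
# 1 4.0
# 2 4.5
# 3 5.333
# 4 6.25
```

A 3:1 allocation thus costs about one-third more patients in total than equal allocation for the same half-width.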

For example, in 1995 Chernick designed a clinical trial (the Tendril DX study) to show that steroid eluting pacing leads were effective in reducing capture thresholds for patients with pacemakers. (For more details, see Chernick, 1999, pp. 63–67.) Steroid eluting leads have steroid in the tip of the lead that slowly oozes out into the tissue. This medication is intended to reduce inflammation. The capture threshold is the minimum required voltage for the electrical shock from the lead into the heart that causes the heart to contract (a forced pacing beat). Lower capture thresholds conserve the pacemaker battery and thus allow a longer period before replacement of the pacemaker. The pacing leads are connected from a pacemaker that is implanted in the patient’s chest and run through part of the circulatory system into the heart where they provide an electrical stimulus to induce pacing heart beats (beats that restore normal heart rhythm).

The investigator chose a value of k = 3 for the study because competitors had demonstrated reductions in capture thresholds for their steroid leads that were approved by the FDA based on similar clinical trials. Factors for k such as 2 and 3 were considered because the company and the investigating physicians wanted a much greater percentage of the patients to receive the steroid leads but did not want k to be so large that the total number of patients enrolled would become very expensive. Consequently, the physicians who were willing to participate in the trial wanted to give the steroid leads to most of their patients, as they perceived it to be a better treatment than the use of leads without the steroid.

Chernick actually planned the Tendril DX trial (assuming thresholds were normally distributed) so that he could reject the null hypothesis of no difference in capture threshold with statistical power of 80% under the alternative hypothesis that the difference was at least 0.5 volts. In Chapter 9, when we consider sample size for hypothesis testing, we will look again at these assumptions (e.g., statistical power) and requirements.

For now, to illustrate sample size calculations based on confidence intervals, let us assume that we want the half-width of a 95% confidence interval for the mean difference to be no greater than d = 0.2 volts. Assume that both leads have the same standard deviation of 0.8 volts. Then, with k = 3, n_t = 3.8416[(k + 1)σ²/d²] = 3.8416[4(0.64/0.04)] = 245.86, or 246 (rounding up to the next integer), and n_c = n_t/3 = 82, which gives a total sample size of 328.

Without changing assumptions, suppose we were able to let k = 1. Then n_t = n_c = 3.8416[2σ²/d²] = 3.8416[2(0.64/0.04)] = 122.93 or 123. This modification gives a much smaller total sample size of 246. Note that by going to a 3:1 randomization scheme (i.e., k = 3), n_t increased by a factor of 2 or a total of 123, while n_c decreased by only 41. We call it a 3:1 randomization scheme because the probability is 0.75 that a patient will receive the steroid lead and 0.25 that a patient will receive the nonsteroid lead.
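The two-group calculations above can be reproduced with a short Python sketch. This is an illustration under the stated assumptions (known, equal standard deviations and n_t = k·n_c). The function name `two_sample_sizes` and its parameter names are hypothetical, and the control group size is rounded up first, with the treatment group size then taken as k times that value.

```python
import math

def two_sample_sizes(sigma, d, k=1, z=1.96):
    """Return (n_t, n_c) so that the half-width
    z * sigma * sqrt(1/n_t + 1/n_c) is at most d, with n_t = k * n_c.

    sigma : common known standard deviation of both groups
    d     : maximum allowed half-width
    k     : allocation ratio n_t / n_c (k >= 1)
    z     : normal critical value (1.96 for 95% confidence, as in the text)
    """
    if sigma <= 0 or d <= 0 or k < 1:
        raise ValueError("need sigma > 0, d > 0 and k >= 1")
    # n_c >= z^2 (k + 1) sigma^2 / (k d^2), rounded up
    n_c = math.ceil(z ** 2 * (k + 1) * sigma ** 2 / (k * d ** 2))
    # Round up again in case k is not an integer
    n_t = math.ceil(k * n_c)
    return n_t, n_c

# Values from the text: sigma = 0.8 volts, d = 0.2 volts
for k in (1, 3):
    n_t, n_c = two_sample_sizes(sigma=0.8, d=0.2, k=k)
    print(k, n_t, n_c, n_t + n_c)
# 1 123 123 246
# 3 246 82 328
```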

Formulae also can be given for more complex situations. However, in some cases iterative procedures by computer are needed. Currently, there are a number of software packages available to handle differing confidence sets and hypothesis testing problems under a variety of assumptions. We will describe some of these software packages in Section 16.3. See the related references in Section 8.12 and Section 16.5.
