
**CONFIDENCE INTERVALS FOR PROPORTIONS**

First we will consider a single proportion and the approximate intervals based on the normal distribution. If $W = X/n$, where $X$ is a binomially distributed random variable with parameters $n$ and $p$, then by the central limit theorem $W$ is approximately normally distributed with mean $p$ and variance $p(1-p)/n$. Therefore, $(W - p)/\sqrt{p(1-p)/n}$ has an approximately standard normal distribution.

Because $p$ is unknown, we cannot standardize $W$ using its true standard deviation $\sqrt{p(1-p)/n}$. Instead, we consider the quantity $U = (W - p)/\sqrt{W(1-W)/n}$. Since $W$ is a consistent estimate of $p$, the quantity $U$ converges in distribution to a standard normal random variable as the sample size $n$ increases.

Therefore, we use the fact that if $U$ were standard normal, then $P[-1.96 \le U \le 1.96] = 0.95$, or $P[-1.96 \le (W - p)/\sqrt{W(1-W)/n} \le 1.96] = 0.95$, or, after the usual algebraic manipulations, $P[W - 1.96\sqrt{W(1-W)/n} \le p \le W + 1.96\sqrt{W(1-W)/n}] = 0.95$. So the random interval $[W - 1.96\sqrt{W(1-W)/n},\ W + 1.96\sqrt{W(1-W)/n}]$ is an approximate 95% confidence interval for a single proportion $p$.
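The "usual algebraic manipulations" can be written out as follows. This is only a sketch of the standard steps, assuming $0 < W < 1$ so that the square root is positive: multiply through by the square root, then subtract $W$ and multiply by $-1$, which reverses the inequalities.

```latex
\begin{aligned}
-1.96 \le \frac{W - p}{\sqrt{W(1-W)/n}} \le 1.96
  &\iff -1.96\sqrt{\frac{W(1-W)}{n}} \le W - p \le 1.96\sqrt{\frac{W(1-W)}{n}} \\
  &\iff -W - 1.96\sqrt{\frac{W(1-W)}{n}} \le -p \le -W + 1.96\sqrt{\frac{W(1-W)}{n}} \\
  &\iff W - 1.96\sqrt{\frac{W(1-W)}{n}} \le p \le W + 1.96\sqrt{\frac{W(1-W)}{n}}
\end{aligned}
```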

$$\left[\,W - 1.96\sqrt{W(1-W)/n},\ \ W + 1.96\sqrt{W(1-W)/n}\,\right] \tag{10.6}$$

where $W = X/n$ and $X$ is binomially distributed with parameters $n$ and $p$. For other confidence levels, change 1.96 to the appropriate constant $C$ from the standard normal distribution.
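As a rough illustration of where the constant $C$ comes from, the sketch below looks it up as the $1 - \alpha/2$ quantile of the standard normal distribution. The function name `normal_critical_value` is hypothetical, and the code assumes Python 3.8 or later for `statistics.NormalDist`.

```python
from statistics import NormalDist


def normal_critical_value(confidence):
    """Return C such that P(-C <= Z <= C) = confidence for a standard normal Z."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1.0 - confidence
    # A two-sided interval leaves alpha/2 in each tail of the distribution.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: C = {normal_critical_value(level):.3f}")
```

For a 95% level this reproduces the familiar value 1.96 to two decimal places.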

As an example, suppose that we have 16 successes in 20 trials; $X = 16$ and $n = 20$. What would be an approximate 95% confidence interval for the population proportion of successes, $p$? From Equation 10.6, since $W = 16/20 = 0.80$, we have $[0.80 - 1.96\sqrt{0.8(0.2)/20},\ 0.80 + 1.96\sqrt{0.8(0.2)/20}] = [0.80 - 0.1753,\ 0.80 + 0.1753] = [0.625, 0.975]$. Later we will compare this interval to the exact interval obtained by the Clopper–Pearson method.
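The calculation in Equation 10.6 can be sketched in a few lines of Python. The function name `normal_approx_interval` is an illustrative choice, not something from the text, and the critical value is taken from `statistics.NormalDist` rather than rounded to 1.96.

```python
from math import sqrt
from statistics import NormalDist


def normal_approx_interval(x, n, confidence=0.95):
    """Approximate interval for p: W +/- C * sqrt(W(1 - W)/n), with W = x/n."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    w = x / n
    # C is the 1 - alpha/2 quantile of the standard normal distribution.
    c = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = c * sqrt(w * (1.0 - w) / n)
    return w - half_width, w + half_width


low, high = normal_approx_interval(16, 20)
print(f"[{low:.3f}, {high:.3f}]")  # prints [0.625, 0.975]
```

Note that nothing in the formula keeps the endpoints inside [0, 1]; for small $n$ and $W$ near 0 or 1 the interval can extend past those limits.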

Now let us consider two independent estimates of proportions, $W_1 = X_1/n_1$ and $W_2 = X_2/n_2$, where $X_1$ is a binomial random variable with parameters $p_1$ and $n_1$ and $X_2$ is a binomial random variable with parameters $p_2$ and $n_2$. Then $Z = [(W_1 - W_2) - (p_1 - p_2)]/\sqrt{W_1(1-W_1)/n_1 + W_2(1-W_2)/n_2}$ has an approximately standard normal distribution. Therefore, $P[-1.96 \le Z \le 1.96]$ is approximately 0.95. After substitution and algebraic manipulations, we have $P[(W_1 - W_2) - 1.96\sqrt{W_1(1-W_1)/n_1 + W_2(1-W_2)/n_2} \le p_1 - p_2 \le (W_1 - W_2) + 1.96\sqrt{W_1(1-W_1)/n_1 + W_2(1-W_2)/n_2}] \approx 0.95$. The probability that $p_1 - p_2$ lies within this interval is approximately 0.95; hence, the random interval $[(W_1 - W_2) - 1.96\sqrt{W_1(1-W_1)/n_1 + W_2(1-W_2)/n_2},\ (W_1 - W_2) + 1.96\sqrt{W_1(1-W_1)/n_1 + W_2(1-W_2)/n_2}]$ is an approximate 95% confidence interval for $p_1 - p_2$.

An approximate 95% confidence interval for the difference between two proportions $p_1 - p_2$ is

$$\left[\,(W_1 - W_2) - 1.96\sqrt{\frac{W_1(1-W_1)}{n_1} + \frac{W_2(1-W_2)}{n_2}},\ \ (W_1 - W_2) + 1.96\sqrt{\frac{W_1(1-W_1)}{n_1} + \frac{W_2(1-W_2)}{n_2}}\,\right] \tag{10.7}$$

where $W_1 = X_1/n_1$ and $X_1$ is binomially distributed with parameters $n_1$ and $p_1$, and $W_2 = X_2/n_2$ and $X_2$ is binomially distributed with parameters $n_2$ and $p_2$. For other confidence levels, change 1.96 to the appropriate constant $C$ from the standard normal distribution.

For a numerical example, suppose $n_1$ is 100 and $n_2$ is 50. Suppose $X_1 = 85$ and $X_2 = 26$. We will calculate the approximate 95% and 99% confidence intervals for $p_1 - p_2$ when $W_1 = 85/100 = 0.85$ and $W_2 = 26/50 = 0.52$. In the case of the 95% confidence interval, the constant $C = 1.96$; hence, the interval is $[(0.85 - 0.52) - 1.96\sqrt{0.85(0.15)/100 + 0.52(0.48)/50},\ (0.85 - 0.52) + 1.96\sqrt{0.85(0.15)/100 + 0.52(0.48)/50}] = [0.175, 0.485]$. For the 99% confidence interval, $C = 2.576$, and the same calculation gives $[0.126, 0.534]$.
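A minimal sketch of Equation 10.7 in Python follows. The function name `difference_interval` is hypothetical, and the constant $C$ is computed from the standard normal quantile function instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist


def difference_interval(x1, n1, x2, n2, confidence=0.95):
    """Approximate interval for p1 - p2 from two independent binomial samples."""
    w1, w2 = x1 / n1, x2 / n2
    # Standard error of W1 - W2, with W1 and W2 plugged in for p1 and p2.
    se = sqrt(w1 * (1.0 - w1) / n1 + w2 * (1.0 - w2) / n2)
    c = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    diff = w1 - w2
    return diff - c * se, diff + c * se


for level in (0.95, 0.99):
    low, high = difference_interval(85, 100, 26, 50, confidence=level)
    print(f"{level:.0%}: [{low:.3f}, {high:.3f}]")
```

Raising the confidence level only changes $C$, so the 99% interval is wider than the 95% interval but has the same center, $W_1 - W_2 = 0.33$.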

For exact intervals, the Clopper–Pearson method is used. Clopper and Pearson (1934) provided the results of their method in graphical form. Hahn and Meeker (1991) reprinted Clopper and Pearson’s work, along with much detail about confidence intervals. The two-sided interval uses the $F$ distribution with the $100(1 - \alpha)\%$ interval given by Equation 10.8. We will learn about the $F$ distribution in Chapter 13.

The exact $100(1 - \alpha)\%$ confidence interval for a single binomial proportion is

$$\left[\,\left\{1 + \frac{(n - x + 1)\,F(1 - \alpha/2:\ 2n - 2x + 2,\ 2x)}{x}\right\}^{-1},\ \ \left\{1 + \frac{n - x}{(x + 1)\,F(1 - \alpha/2:\ 2x + 2,\ 2n - 2x)}\right\}^{-1}\,\right] \tag{10.8}$$

where $x$ is the number of successes in $n$ Bernoulli trials and $F(\gamma: \mathit{dfn}, \mathit{dfd})$ is the $100\gamma$th percentile of an $F$ distribution with $\mathit{dfn}$ degrees of freedom for the numerator and $\mathit{dfd}$ degrees of freedom for the denominator. For the lower endpoint, $\gamma = 1 - \alpha/2$, $\mathit{dfn} = 2n - 2x + 2$, and $\mathit{dfd} = 2x$. For the upper endpoint, $\gamma = 1 - \alpha/2$, $\mathit{dfn} = 2x + 2$, and $\mathit{dfd} = 2n - 2x$.
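To make the bookkeeping in the displayed formula concrete, here is a small sketch that works out the degrees of freedom and then evaluates both endpoints from two $F$ percentiles supplied by the reader (for instance, from a table). Both function names are hypothetical, and the code assumes $0 < x < n$, since the formula divides by $x$ and needs a positive denominator degrees of freedom for the upper endpoint.

```python
def f_degrees_of_freedom(x, n):
    """Return ((dfn, dfd) for the lower endpoint, (dfn, dfd) for the upper endpoint)."""
    return (2 * n - 2 * x + 2, 2 * x), (2 * x + 2, 2 * n - 2 * x)


def exact_interval_from_f(x, n, f_lower, f_upper):
    """Evaluate the exact interval given the two F percentiles.

    f_lower is F(1 - alpha/2: 2n - 2x + 2, 2x)
    f_upper is F(1 - alpha/2: 2x + 2, 2n - 2x)
    Both must be looked up by the caller; this function does no table lookup.
    """
    if not 0 < x < n:
        raise ValueError("this sketch assumes 0 < x < n")
    lower = 1.0 / (1.0 + (n - x + 1) * f_lower / x)
    upper = 1.0 / (1.0 + (n - x) / ((x + 1) * f_upper))
    return lower, upper


# Degrees of freedom needed for x = 16 successes in n = 20 trials.
print(f_degrees_of_freedom(16, 20))  # prints ((10, 32), (34, 8))
```

Keeping the table lookup outside the function makes it easy to see how sensitive the endpoints are to the percentile that is plugged in.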

Now let us revisit the example for approximate confidence intervals where $X = 16$ and $n = 20$. For a 95% interval, $\alpha = 0.05$ and $1 - \alpha/2 = 0.975$. Equation 10.8 becomes $[\{1 + 5F(0.975: 10, 32)/16\}^{-1},\ \{1 + 4/\{17F(0.975: 34, 8)\}\}^{-1}]$. For now we will take these percentiles by consulting a table for the $F$ distribution. From the table (Appendix A), we find by interpolation that $F(0.975: 10, 32) \approx 2.48$ and $F(0.975: 34, 8) \approx 3.87$ (the latter lies between $F(0.975: 30, 8) = 3.89$ and $F(0.975: 40, 8) = 3.84$). Plugging these values into Equation 10.8, we obtain the interval [0.563, 0.943]. The value 0.975 tells us the percentile to look up in the table; the two other parameters are the numerator and denominator degrees of freedom, to be defined in Chapter 12.
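When no $F$ table is at hand, the same exact limits can be obtained numerically. The sketch below does not use the $F$ formula; it relies instead on the equivalent tail-probability characterization of the Clopper–Pearson limits (a standard fact, not stated in the text): the lower limit $p_L$ solves $P(X \ge x \mid p_L) = \alpha/2$ and the upper limit $p_U$ solves $P(X \le x \mid p_U) = \alpha/2$. All function names are hypothetical, and plain bisection is used so that only the Python standard library is needed.

```python
from math import comb


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p); this is increasing in p."""
    return sum(comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(k, n + 1))


def solve_increasing(f, lo=0.0, hi=1.0, tol=1e-10):
    """Bisection root of an increasing function f with f(lo) < 0 < f(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_interval(x, n, confidence=0.95):
    """Clopper-Pearson interval from the binomial tail equations."""
    alpha = 1.0 - confidence
    # Lower limit: P(X >= x | p) = alpha/2 (taken as 0 when x = 0).
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: upper_tail(x, n, p) - alpha / 2.0)
    # Upper limit: P(X <= x | p) = alpha/2, i.e. P(X >= x + 1 | p) = 1 - alpha/2
    # (taken as 1 when x = n).
    upper = 1.0 if x == n else solve_increasing(
        lambda p: upper_tail(x + 1, n, p) - (1.0 - alpha / 2.0))
    return lower, upper


low, high = exact_interval(16, 20)
print(f"[{low:.3f}, {high:.3f}]")
```

Summing the binomial probabilities directly is adequate for sample sizes like those in this section; for very large $n$ a library routine for the beta or $F$ quantile would be a more robust choice.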

Compare this new interval to the interval from the normal approximation, [0.625, 0.975]. Note that the widths of the intervals are about the same, but the normal approximation gives a symmetric interval centered at 0.80, whereas the exact interval is not symmetric about 0.80. The reason for the difference is that the sample size of 20 is too small for the normal approximation to be very good. The true proportion is probably close to 0.80, and a binomial distribution centered at 0.80 is much more skewed than a normal distribution, with a longer left tail than right tail. In this case, the exact binomial solution is appropriate but the normal approximation is not.
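The remark about the longer left tail can be checked numerically. The sketch below computes the skewness coefficient of a binomial distribution directly from its probabilities, under the illustrative assumption that the true proportion is exactly 0.80; the function name `binomial_skewness` is hypothetical. A negative value indicates a longer left tail, and a value of 0 would indicate symmetry, as for a normal distribution.

```python
from math import comb


def binomial_skewness(n, p):
    """Skewness E[(X - mean)^3] / variance^1.5 for X ~ Binomial(n, p)."""
    pmf = [comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(pmf))
    variance = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    third_moment = sum((k - mean) ** 3 * q for k, q in enumerate(pmf))
    return third_moment / variance**1.5


for n in (20, 100):
    print(f"n = {n}: skewness = {binomial_skewness(n, 0.8):.3f}")
```

With $p = 0.80$ the skewness is negative for both sample sizes and is closer to zero for $n = 100$ than for $n = 20$, which is consistent with the normal approximation improving as $n$ grows.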

If $n$ were 100, the normal approximation and the exact binomial distribution would be in much closer agreement. So let us make the comparison when $n = 100$ and $x = 80$. The normal approximation gives $[0.80 - 1.96\sqrt{0.8(0.2)/100},\ 0.80 + 1.96\sqrt{0.8(0.2)/100}] = [0.722, 0.878]$, whereas the Clopper–Pearson method gives $[\{1 + 21F(0.975: 42, 160)/80\}^{-1},\ \{1 + 20/\{81F(0.975: 162, 40)\}\}^{-1}]$. We have $F(0.975: 42, 160) \approx 1.57$ (by interpolation in the table, Appendix A) and $F(0.975: 162, 40) \approx 1.70$ (also by interpolation in the table). Substituting these values in the equation above gives the interval [0.708, 0.873]. We note that the normal approximation, though not as accurate as we would like, is much closer to the exact result when the sample size is 100 than when the sample size is only 20.
