Biostatistics for the Health Sciences: Estimating Population Means - Confidence Intervals for the Difference between Means from Two Independent Samples (Variance Unknown)

**CONFIDENCE INTERVALS FOR THE DIFFERENCE BETWEEN MEANS FROM TWO
INDEPENDENT SAMPLES (POPULATION VARIANCE UNKNOWN)**

When the variances of the parent populations from which the samples are selected are unknown, we use the *t* statistic with the pooled variance formula from Section 8.5, assuming normal distributions and equal variances. When the distributions are normal but the variances are assumed to be unequal, we use the *k* statistic from Section 8.5 with the individual sample variances. When using *k*, we apply the Welch–Aspin *t* approximation with *v* degrees of freedom, where *v* is defined as in Section 8.5.

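In the first case the 95% confidence interval is

$$\left[(\bar{X}_t - \bar{X}_c) - C\,S_p\sqrt{\frac{1}{n_t} + \frac{1}{n_c}},\ \ (\bar{X}_t - \bar{X}_c) + C\,S_p\sqrt{\frac{1}{n_t} + \frac{1}{n_c}}\right],$$

where $\bar{X}_t$ and $\bar{X}_c$ are the sample means of the two groups, *S*_{p} is the pooled estimate of the standard deviation, and *C* is the appropriate constant such that *P*(–*C* ≤ *t* ≤ *C*) = 0.95 when *t* has a Student’s *t* distribution with *n*_{t} + *n*_{c} – 2 degrees of freedom.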

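Now recall that *S*_{p} is the square root of the pooled variance

$$S_p^2 = \frac{(n_t - 1)S_t^2 + (n_c - 1)S_c^2}{n_t + n_c - 2}.$$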
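The pooled-variance interval can be sketched in a few lines of Python. This is a minimal illustration, not the book's own code: the function names `pooled_sd` and `pooled_t_interval` and all numeric inputs are hypothetical, and the constant *C* is assumed to be read from a *t* table (2.1009 is the usual two-sided 95% value for 18 degrees of freedom).

```python
import math

def pooled_sd(s_t, n_t, s_c, n_c):
    """Pooled standard deviation from two sample standard deviations."""
    pooled_var = ((n_t - 1) * s_t**2 + (n_c - 1) * s_c**2) / (n_t + n_c - 2)
    return math.sqrt(pooled_var)

def pooled_t_interval(mean_t, mean_c, s_p, n_t, n_c, c):
    """Interval (mean_t - mean_c) +/- c * s_p * sqrt(1/n_t + 1/n_c)."""
    diff = mean_t - mean_c
    half_width = c * s_p * math.sqrt(1 / n_t + 1 / n_c)
    return diff - half_width, diff + half_width

# Hypothetical inputs, chosen only to exercise the functions.
n_t, n_c = 10, 10
s_p = pooled_sd(4.0, n_t, 6.0, n_c)
df = n_t + n_c - 2          # degrees of freedom for the t table lookup
c = 2.1009                  # t table value for 18 df, two-sided 95%
low, high = pooled_t_interval(52.0, 47.0, s_p, n_t, n_c, c)
print(df, round(s_p, 4), round(low, 3), round(high, 3))
```

Passing *C* in as an argument keeps the sketch free of third-party dependencies; a statistics library could supply the *t* quantile instead.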

**Display 8.4. 95% Confidence Interval For the Difference Between Two
Population Means (Common Population Variance Known)**

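With sample sizes of 9 and 16, a difference between the sample means of 99.5, and *S*_{p} = 121.62, the 95% confidence interval is computed as follows: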

From the *t*
table we see that *C* = 2.0687 since
the degrees of freedom are 23. Using this value for *C* we get the following:

[99.5 – 2.0687{121.62 √[(1/9) + (1/16)]}, 99.5 + 2.0687{121.62 √[(1/9) + (1/16)]}]

= [99.5 – 249.53 √0.1736, 99.5 + 249.53 √0.1736]

= [99.5 – 249.53(0.4167), 99.5 + 249.53(0.4167)]

= [99.5 – 103.98, 99.5 + 103.98] = [–4.48, 203.48]

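In the second case, the 95% confidence interval is

$$\left[(\bar{X}_t - \bar{X}_c) - C\sqrt{\frac{S_t^2}{n_t} + \frac{S_c^2}{n_c}},\ \ (\bar{X}_t - \bar{X}_c) + C\sqrt{\frac{S_t^2}{n_t} + \frac{S_c^2}{n_c}}\right],$$

where *S*^{2}_{t} is the sample estimate of variance for the treatment group, *S*^{2}_{c} is the sample estimate of variance for the control group, and *C* is the appropriate constant from the Student’s *t* distribution with *v* degrees of freedom.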

**Display 8.5.** **95% Confidence Interval
For the Difference Between Two Population Means (Common Population Variance
Unknown)**

Let us consider an example from the pharmaceutical industry. A company is interested in marketing a clotting agent that reduces blood loss when an accident causes an internal injury such as liver trauma. To study possible doses of the agent and obtain some indication of safety and efficacy, the company conducts an experiment in which a controlled liver injury is induced in pigs and blood loss is measured. Pigs are randomized as to whether they receive the drug after the injury or do not receive drug therapy; these are the treatment and control groups, respectively.

The following data were taken from a study in which there were 10 pigs in the treatment group and 10 in the control group. The blood loss was measured in milliliters and is given in Table 8.1.

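When the variances are known, we use the *Z* statistic defined in the previous section, namely

$$Z = \frac{(\bar{X}_t - \bar{X}_c) - (\mu_t - \mu_c)}{\sqrt{\sigma_t^2/n_t + \sigma_c^2/n_c}}.$$

*Z* has exactly the standard normal distribution when the observations in both samples are normally distributed. Also, based on the central limit theorem, *Z* is approximately normal if the conditions for the central limit theorem are satisfied for each population being sampled. So for a 95% confidence interval we know that *P*(–*C* ≤ *Z* ≤ *C*) = 0.95 if *C* = 1.96, that is, *P*(–1.96 ≤ *Z* ≤ 1.96) = 0.95. After some algebra we find that

$$P\left[(\bar{X}_t - \bar{X}_c) - 1.96\sqrt{\frac{\sigma_t^2}{n_t} + \frac{\sigma_c^2}{n_c}} \le \mu_t - \mu_c \le (\bar{X}_t - \bar{X}_c) + 1.96\sqrt{\frac{\sigma_t^2}{n_t} + \frac{\sigma_c^2}{n_c}}\right] = 0.95.$$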

**TABLE 8.1. Pig Blood Loss Data (ml)**

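So the 95% confidence interval is

$$\left[(\bar{X}_t - \bar{X}_c) - 1.96\sqrt{\frac{\sigma_t^2}{n_t} + \frac{\sigma_c^2}{n_c}},\ \ (\bar{X}_t - \bar{X}_c) + 1.96\sqrt{\frac{\sigma_t^2}{n_t} + \frac{\sigma_c^2}{n_c}}\right],$$

where $\sigma_t^2$ and $\sigma_c^2$ are the known population variances. For other confidence levels we just change the constant *C* to 1.645 for 90% or 2.576 for 99%.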
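The constants for the *Z*-based interval can be checked with Python's standard library. This is a small sketch under the assumption that *C* is the upper (1 + confidence)/2 quantile of the standard normal distribution; the helper name `z_constant` is hypothetical.

```python
from statistics import NormalDist

def z_constant(confidence):
    """Return C such that P(-C <= Z <= C) = confidence for standard normal Z."""
    upper_tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - upper_tail)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_constant(level), 4))
```

The 90% and 95% values come out at about 1.645 and 1.960. The 99% value is about 2.5758, which printed tables round to 2.575 or 2.576.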

For these data, we note a large difference between the sample standard deviations: 717.12 for the treatment group versus 1824.27 for the control group. This result is not compatible with the assumption of equal variance. We will make the assumption anyway to illustrate the calculation. We will then revisit this example and calculate the confidence interval obtained, dropping the equal variance assumption and using the *t* approximation with the *k* statistic. In Section 8.9, we will look at the result we would obtain from a bootstrap percentile method confidence interval where the questionable normality assumption can be dropped. In Chapter 9, we will look at the conclusions of various hypothesis tests based on these pig blood loss data and various assumptions about the population variances. We will revisit the example one more time in Section 14.3, where we will apply a nonparametric technique called the Wilcoxon rank-sum test to these data.

Using the formula for the estimated common variance (Display 8.5), we must calculate the pooled variance *S*_{p}^{2}. The term *S*_{p}^{2} = {(*n*_{t} – 1)*S*_{t}^{2} + (*n*_{c} – 1)*S*_{c}^{2}}/(*n*_{t} + *n*_{c} – 2).

In Chapter 9 (on
hypothesis testing), you will learn that because the interval does not contain
0, you are able to reject the hypothesis of no difference in average blood
loss.

We note that if we had chosen a 90% confidence interval, for which *C* = 1.7341 (based on the tables for Student’s *t* distribution), the resulting interval would be [(1085.9 – 2187.4) – 1.7341(1475.89) √0.1, (1085.9 – 2187.4) + 1.7341(1475.89) √0.1] = [–1101.5 – 809.33, –1101.5 + 809.33] = [–1910.83, –292.17].

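Now let us look at the result obtained from assuming unequal variances, a more realistic assumption (refer to Display 8.6). The confidence interval would then be

$$\left[(\bar{X}_t - \bar{X}_c) - C\sqrt{\frac{S_t^2}{n_t} + \frac{S_c^2}{n_c}},\ \ (\bar{X}_t - \bar{X}_c) + C\sqrt{\frac{S_t^2}{n_t} + \frac{S_c^2}{n_c}}\right],$$

where *C* is obtained from a Student’s *t* distribution with *v* degrees of freedom and

$$v = \frac{\left(\dfrac{S_t^2}{n_t} + \dfrac{S_c^2}{n_c}\right)^2}{\dfrac{(S_t^2/n_t)^2}{n_t - 1} + \dfrac{(S_c^2/n_c)^2}{n_c - 1}}.$$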

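Using *S*_{t} = 717.12 and *S*_{c} = 1824.27 with *n*_{t} = *n*_{c} = 10, we obtain *v* = 11.717. Because *v* is not an integer, the constant *C* must be interpolated between the table values for 11 and 12 degrees of freedom.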
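The degrees of freedom for the pig data can be reproduced with a short Python sketch. It assumes that *v* takes the usual Welch–Satterthwaite form with *n* – 1 in the denominators, and it uses the sample standard deviations 717.12 and 1824.27 quoted earlier with 10 pigs per group; the function name `welch_df` is hypothetical.

```python
def welch_df(s_t, n_t, s_c, n_c):
    """Approximate degrees of freedom for the unequal-variance t interval."""
    a = s_t**2 / n_t   # estimated variance of the treatment sample mean
    b = s_c**2 / n_c   # estimated variance of the control sample mean
    return (a + b) ** 2 / (a**2 / (n_t - 1) + b**2 / (n_c - 1))

v = welch_df(717.12, 10, 1824.27, 10)
print(round(v, 3))
```

Under these assumptions the result agrees with the value of about 11.717 used in the interpolation.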

We solve for *x*, the interpolated value for *C*, by linear interpolation between the table values *C* = 2.1788 for 12 degrees of freedom and *C* = 2.2010 for 11 degrees of freedom. The ratio of the change in degrees of freedom from 12 to 11.717 to the change from 12 to 11 is set equal to the ratio of the change in *C* from its value at 12 degrees of freedom to *x* to the change in *C* from 12 degrees of freedom to 11 degrees of freedom. So 0.283/1 = (2.1788 – *x*)/(–0.0222), or –0.283(0.0222) = 2.1788 – *x*, or *x* = 2.1788 + 0.283(0.0222) = 2.1788 + 0.0063 = 2.1851.
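The interpolation step can be written as a small Python helper. This is a sketch only: the name `interpolate_c` is hypothetical, and the table values 2.2010 (11 degrees of freedom) and 2.1788 (12 degrees of freedom) are taken as given.

```python
def interpolate_c(df, df_low, c_low, df_high, c_high):
    """Linearly interpolate a t critical value between two table rows."""
    # Fraction of the way from the higher df row back toward the lower df row.
    fraction = (df_high - df) / (df_high - df_low)
    return c_high + fraction * (c_low - c_high)

x = interpolate_c(11.717, 11, 2.2010, 12, 2.1788)
print(round(x, 4))  # about 2.1851
```

Linear interpolation is a table-lookup convenience; software that evaluates the *t* quantile at fractional degrees of freedom may give a slightly different constant.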

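So taking *C* = 2.185, and noting that *S*_{t}^{2}/*n*_{t} = 717.12²/10 = 51426.1 and *S*_{c}^{2}/*n*_{c} = 1824.27²/10 = 332796.1, the 95% confidence interval is [(1085.9 – 2187.4) – 2.185 √(51426.1 + 332796.1), (1085.9 – 2187.4) + 2.185 √(51426.1 + 332796.1)] = [–1101.5 – 2.185(619.86), –1101.5 + 2.185(619.86)] = [–1101.5 – 1354.39, –1101.5 + 1354.39] = [–2455.89, 252.89].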

We note that this interval is different from the previous calculation for the common variance estimate and perhaps more realistic. The conclusion is also qualitatively different from the previous calculation because in this case the interval contains 0, whereas under the equal variance assumption it did not!

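**Display 8.6. A 95% Confidence Interval for a Difference Between Two Population Means (Different Unknown Population Variances)**

$$\left[(\bar{X}_t - \bar{X}_c) - C\sqrt{\frac{S_t^2}{n_t} + \frac{S_c^2}{n_c}},\ \ (\bar{X}_t - \bar{X}_c) + C\sqrt{\frac{S_t^2}{n_t} + \frac{S_c^2}{n_c}}\right]$$

where:

*n*_{t} is the sample size for the treatment group

*S*^{2}_{t} is the sample estimate of variance for the treatment group

*n*_{c} is the sample size for the control group

*S*^{2}_{c} is the sample estimate of variance for the control group

*C* is the 97.5th percentile of the *t* distribution with *v* degrees of freedom, with *v* given by

$$v = \frac{\left(\dfrac{S_t^2}{n_t} + \dfrac{S_c^2}{n_c}\right)^2}{\dfrac{(S_t^2/n_t)^2}{n_t - 1} + \dfrac{(S_c^2/n_c)^2}{n_c - 1}}$$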
