ONE-TAILED VERSUS TWO-TAILED TESTS
In the previous section, we pointed out that when determining the significance level of a test we must specify either a one-tailed or a two-tailed test. The decision should be based on the context of the problem, i.e., the outcome that we wish to demonstrate. We must consider the relevant research hypothesis, which becomes the alternative hypothesis.
For example, in the Tendril DX trial, we have strong prior evidence from other studies that the steroid (treatment group) leads tend to provide lower capture thresholds than the nonsteroid (control group) leads. Also, we are interested in marketing our product only if we can claim, as do our competitors, that our lead reduces capture thresholds by at least 0.5 volts as compared to nonsteroid leads.
Because we would like to assert that we are able to reduce capture thresholds, it is natural to look at a one-sided alternative. In this case, the null hypothesis H0 is μ1 – μ0 ≥ 0 versus the alternative H1 that μ1 – μ0 < 0, where μ1 = the population mean for the treatment group and μ0 = the population mean for the control group. In Section 9.8, we will see that the appropriate t statistic (under the normality assumption) would have a critical region determined by t < –tα, where tα is the 100(1 – α) percentile of Student’s t distribution with nc + nt – 2 degrees of freedom, nc is the number of observations in the control group, and nt is the number of observations in the treatment group.
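As a rough illustration of the one-sided rule just described, the following Python sketch computes a two-sample t statistic and checks it against the left-tail critical region t < –tα. The function names, the pooled (equal-variance) form of the statistic, and the summary values in the example call are assumptions made for illustration only; they are not data from the Tendril DX trial. The constant 1.8125 is the tabulated 95th percentile of Student’s t distribution with 10 degrees of freedom.

```python
import math

def pooled_t(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Return (t statistic, degrees of freedom) for a pooled two-sample t test.

    Assumes both groups share a common variance, which matches the
    nc + nt - 2 degrees of freedom quoted in the text.
    """
    df = n_c + n_t - 2
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_t + 1.0 / n_c))
    return (mean_t - mean_c) / std_err, df

def reject_left_tail(t_stat, t_alpha):
    """Left-tailed rule: reject H0 when t < -t_alpha."""
    return t_stat < -t_alpha

# Hypothetical summary statistics (illustrative, NOT trial data); n_t = 3 * n_c.
t_stat, df = pooled_t(mean_t=1.0, mean_c=1.5, sd_t=0.5, sd_c=0.5, n_t=9, n_c=3)

T_ALPHA_10DF = 1.8125  # tabulated 95th percentile of t with 10 df (alpha = 0.05)
decision = reject_left_tail(t_stat, T_ALPHA_10DF)
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {decision}")
```

With these made-up numbers the statistic is about –1.5, which does not fall below –1.8125, so H0 would not be rejected at the 5% level.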
In the real application, Chernick and associates
took nt = 3nc and chose the values for nc and nt such that the power of the test was at least 80% when
μ1 – μ0 < –0.5; α was set at 0.05. We will
calculate the sample size for this example in Section 9.8 after we introduce
the power function.
In other applications, we may be trying to show
only equivalence in medical effectiveness of a new treatment compared to an
old one. For medical devices or pharmaceuticals, this test of equivalence may occur
when the current product (the control) is an effective treatment and we want to
show that the new product is equally effective. However, the new product may be
preferred for other reasons, such as ease of application. One example might be
the introduction of a simpler needle (called a pen in the industry) to inject
the insulin that controls sugar levels for diabetic patients, as compared to a
standard insulin injection.
In such cases, the null hypothesis is μ1 – μ0 = 0, versus the alternative μ1 – μ0 ≠ 0. Here, we wish to control the type II (β) error. To do this, we must specify a δ so that we have a good chance of rejecting equivalence if |μ1 – μ0| > δ. Often, δ is chosen to be some clinically relevant difference in the means. The sample size would be chosen so that when |μ1 – μ0| > δ, the probability that the test statistic is large enough to reject H0 is high (80% or 90% or 95%), corresponding to a low type II error (20% or 10% or 5%, respectively). For this problem, H0 is rejected when |t| > tα/2, where tα/2 is the 100(1 – α/2) percentile of the t distribution with nc + nt – 2 degrees of freedom, nc is the number of observations in the control group, and nt is the number of observations in the treatment group.
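To give a feel for how the required sample size grows with the desired power, here is a minimal Python sketch based on the usual normal approximation for comparing two means. It assumes equal group sizes and a known common standard deviation, and it is not the exact t-based calculation that the text defers to Section 9.8. The function name and the values σ = 1.0 and δ = 0.5 are hypothetical choices for illustration.

```python
import math
from statistics import NormalDist

def n_per_group(alpha, power, sigma, delta):
    """Approximate per-group sample size for a two-sided, two-sample test.

    Normal approximation with equal group sizes and known common sigma:
        n = 2 * (sigma * (z_{1 - alpha/2} + z_{power}) / delta) ** 2
    rounded up to the next whole observation.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    z_beta = std_normal.inv_cdf(power)               # power = 1 - beta
    return math.ceil(2.0 * (sigma * (z_alpha + z_beta) / delta) ** 2)

# Hypothetical inputs: sigma = 1.0 and delta = 0.5 are illustrative only.
for power in (0.80, 0.90, 0.95):
    n = n_per_group(alpha=0.05, power=power, sigma=1.0, delta=0.5)
    print(f"power = {power:.2f}: about {n} observations per group")
```

Under these assumptions, raising the power from 80% to 95% (that is, lowering the type II error from 20% to 5%) increases the approximate requirement from 63 to 104 observations per group.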
However, such a test is really backwards because
the scientific hypothesis that we want to confirm is the null hypothesis rather
than the alternative. It is for this reason that Blackwelder and others
(Blackwelder, 1982) have recommended, for equivalence testing (defined in the
foregoing example) and also for noninferiority testing (a one-sided form of
equivalence), that we really want to “prove the null hypothesis” in the
Neyman–Pearson framework.
Hence, Blackwelder advocates simply switching the null and alternative hypotheses, so that the null hypothesis states nonequivalence and the alternative states equivalence; rejecting the null hypothesis then amounts to accepting equivalence. Switching the null and alternative hypotheses allows us to control, through type I error, the probability of falsely claiming equivalence. When we set the type I (α) and type II (β) errors (i.e., the type II error at |μ1 – μ0| = δ) to be equal, the distinction between α and β errors becomes unimportant, because if α = β, both formulations yield the same required sample size for a specified power. When |μ1 – μ0| = δ but α ≠ β, the results differ from those obtained when α = β. Because it is common to choose α < β, the Blackwelder approach often is preferred, particularly by the Food and Drug Administration. For more details see Blackwelder’s often-cited article (Blackwelder, 1982).
Now let us look step by step at a one-tailed (left-tail) test procedure for the pig blood loss data considered in the previous section. A left-tailed test means that we reject H0 if we can show that μ < μ0. Alternatively, a right-tailed test denotes rejecting H0 if we can show that μ > μ0.
1. State the null hypothesis H0: μ = μ0 versus the alternative hypothesis H1: μ < μ0.
2. Choose a significance level α = α0 (often we take α0 = 0.05 or 0.01).
3. Determine the critical region, i.e., the region of values of t in the lower (left) tail of area α0 (e.g., α0 = 0.05) of the sampling distribution for Student’s t distribution with n – 1 degrees of freedom when μ = μ0 (i.e., the sampling distribution when the null hypothesis is true).
4. Compute the t statistic: t = (x̄ – μ0)/(s/√n) for the given sample and sample size n, where x̄ is the sample mean and s is the sample standard deviation.
5. Reject the null hypothesis if
the test statistic t (computed in
step 4) falls in the rejection region for this test; otherwise, do not reject
the null hypothesis.
Again we will use the sample data given in Section
8.9 but this time use the standard deviation s = 717.12. The sample mean is
1085.9 and the sample size n = 10. We
now have enough information to do the test.
We have the following five steps:
1. The null hypothesis is H0: μ = μ0 = 2200 versus the alternative hypothesis H1: μ < μ0 = 2200.
2. Choose a significance level α = α0 = 0.05.
3. Determine the critical region, that is, the region of values of t in the lower 0.05 tail of the sampling distribution for t (Student’s t distribution with 9 degrees of freedom) when μ = μ0 (i.e., the sampling distribution when the null hypothesis is true). For α0 = 0.05 the critical value is t = –1.8331; therefore, the critical region includes all values of t < –1.8331.
4. Compute the t statistic: t = (x̄ – μ0)/(s/√n) for the given sample and sample size n = 10. We know that n = 10, the sample mean is x̄ = 1085.9, s = 717.12, and μ0 = 2200. Thus t = (1085.9 – 2200)/(717.12/√10) = –1114.1/226.773 = –4.913.
5. Since –4.913 is clearly less
than –1.8331, we reject H0
at the 5% level.
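The arithmetic in steps 4 and 5 can be checked with a short Python sketch. It uses only the numbers quoted above (sample mean 1085.9, s = 717.12, n = 10, μ0 = 2200, and the tabulated critical value 1.8331 for 9 degrees of freedom); the function and variable names are illustrative choices.

```python
import math

def one_sample_t(sample_mean, mu0, s, n):
    """One-sample t statistic: (sample mean - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (s / math.sqrt(n))

# Values quoted in the worked example for the pig blood loss data.
SAMPLE_MEAN = 1085.9
MU0 = 2200.0
S = 717.12
N = 10
T_CRIT_9DF = 1.8331  # tabulated 95th percentile of t with 9 df (alpha = 0.05)

t_stat = one_sample_t(SAMPLE_MEAN, MU0, S, N)

# Left-tailed test: reject H0 when t falls below the negative critical value.
reject_left = t_stat < -T_CRIT_9DF
print(f"t = {t_stat:.3f}, reject H0 (left-tailed): {reject_left}")
```

The computed statistic rounds to –4.913, well inside the critical region t < –1.8331, which agrees with the decision in step 5.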
In the previous example, if it were appropriate to use a one-tailed (right-tail) test, the procedure would change as follows:
In step 1, we would take H1: μ > μ0 = 2200.
In step 3, we would consider the upper α tail of the sampling distribution for t (Student’s t
distribution with 9 degrees of freedom) when μ = μ0 (i.e., the sampling distribution when the null
hypothesis is true).
In step 5, the rejection region would be values of t > 1.8331.
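To make the contrast between the two one-tailed rules concrete, the sketch below applies both to the t statistic from the worked example (t = –4.913, critical value 1.8331 with 9 degrees of freedom). The helper name reject_h0 and its tail argument are hypothetical, introduced only for this illustration.

```python
def reject_h0(t_stat, t_crit, tail):
    """Apply a one-tailed rejection rule.

    tail="left":  reject when t < -t_crit  (H1: mu < mu0)
    tail="right": reject when t >  t_crit  (H1: mu > mu0)
    """
    if tail == "left":
        return t_stat < -t_crit
    if tail == "right":
        return t_stat > t_crit
    raise ValueError("tail must be 'left' or 'right'")

T_STAT = -4.913   # t statistic from step 4 of the worked example
T_CRIT = 1.8331   # tabulated critical value, 9 df, alpha = 0.05

left_decision = reject_h0(T_STAT, T_CRIT, "left")
right_decision = reject_h0(T_STAT, T_CRIT, "right")
print(f"left-tailed: reject H0 = {left_decision}")
print(f"right-tailed: reject H0 = {right_decision}")
```

The same data lead to rejection under the left-tailed alternative but not under the right-tailed one, which is why the direction of the alternative must be fixed from the research question before the data are examined.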