Estimation Versus Hypothesis Testing

Chapter: Biostatistics for the Health Sciences: Estimating Population Means

In this section, we move from descriptive statistics to inferential statistics. In descriptive statistics, we simply summarize information available in the data we are given. In inferential statistics, we draw conclusions about a population based on a sample and a known or assumed sampling distribution. Implicit in statistical inference is the assumption that the data were gathered as a random sample from a population.

Examples of the types of inferences that can be made are estimation, conclusions from hypothesis tests, and predictions of future observations. In estimation, we are interested in choosing the “best” estimate of a population parameter based on the sample and statistical theory.

For example, as we saw in Chapter 7, when data are sampled from a normal distribution, the sample mean is itself normally distributed, with mean equal to the population mean and variance equal to the population variance divided by the sample size n. Recall that the distribution of a statistic such as a sample mean is called a sampling distribution. The Gauss–Markov theory goes on to determine that the sample mean is the best estimate of the population mean. That is, for a sample of size n it gives us the most accurate answer (e.g., it has minimum variance, and hence smallest mean square error, among unbiased estimators).
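The claim about the sampling distribution of the sample mean can be checked with a small simulation. The sketch below is illustrative only: the population values (mean 120, standard deviation 15), the sample size, and the function name `simulate_sample_means` are assumptions chosen for the example, not values from the text.

```python
import random
import statistics


def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size n from N(mu, sigma^2) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]


# Hypothetical population parameters, chosen only for illustration.
mu, sigma, n = 120.0, 15.0, 25

means = simulate_sample_means(mu, sigma, n, reps=20000)

mean_of_means = statistics.fmean(means)   # should be close to mu
var_of_means = statistics.pvariance(means)  # should be close to sigma^2 / n
theoretical_var = sigma ** 2 / n            # 225 / 25 = 9

print(f"mean of sample means:     {mean_of_means:.3f} (population mean {mu})")
print(f"variance of sample means: {var_of_means:.3f} (theory {theoretical_var})")
```

With many repetitions, the average of the simulated sample means should sit near the population mean and their variance near the population variance divided by n.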

The sample mean is a point estimate, but we know it has a sampling distribution. Hence, the sample mean will not be exactly equal to the population mean. However, the theory we have tells us about its sampling distribution; thus, statistical theory can aid us in describing our uncertainty about the population mean based on our knowledge of the sampling distribution for the sample mean.

In Section 8.2, we will further discuss point estimates and in Section 8.3 we will discuss confidence intervals. Confidence intervals are merely interval estimates (based on the observed data) of population parameters that express a range of values that are likely to contain the parameter. We will describe how the sampling distribution of the point estimate is used to get confidence intervals in Section 8.3.
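As a preview of how a sampling distribution leads to an interval estimate, here is a minimal sketch of one standard construction: a confidence interval for the mean when the population standard deviation is assumed known. The data, the value of `sigma`, and the function name `z_confidence_interval` are hypothetical, and the sketch is not necessarily the presentation used in Section 8.3.

```python
import math
import statistics
from statistics import NormalDist


def z_confidence_interval(sample, sigma, level=0.95):
    """Interval for the population mean, assuming sigma is known and the
    sample mean is (at least approximately) normally distributed."""
    n = len(sample)
    point_estimate = statistics.fmean(sample)
    # Quantile of the standard normal that leaves (1 - level) / 2 in each tail.
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    half_width = z * sigma / math.sqrt(n)
    return point_estimate - half_width, point_estimate + half_width


# Hypothetical measurements, chosen only for illustration.
sample = [118, 122, 125, 115, 120]
lower, upper = z_confidence_interval(sample, sigma=15.0)
print(f"point estimate: {statistics.fmean(sample):.1f}")
print(f"95% interval:   ({lower:.2f}, {upper:.2f})")
```

The interval is centered on the point estimate, and its width shrinks as the sample size grows because the standard error is sigma divided by the square root of n.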

In hypothesis testing, we construct a null and an alternative hypothesis. Usually, the null hypothesis is an uninteresting hypothesis that we would like to reject. You will see examples in Chapter 9. The alternative hypothesis is generally the interesting scientific hypothesis that we would like to “prove.” However, we do not actually “prove” the alternative hypothesis; we merely reject the null hypothesis and retain a degree of uncertainty about its status.

Due to statistical uncertainty, one can never absolutely prove a hypothesis based on a sample. We will draw conclusions based on our sample data and associate an error probability with our possible conclusion. When our conclusion favors the null hypothesis, we prefer to say that we fail to reject the null hypothesis rather than that we accept the null hypothesis.

In setting up the hypothesis test, we will determine a critical value in advance of looking at the data. This critical value is selected to control the probability of a type I error (i.e., of falsely rejecting the null hypothesis). This is the so-called Neyman–Pearson formulation that we will describe in Section 9.2.
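To illustrate how a critical value fixed in advance controls the type I error, the sketch below simulates many samples for which the null hypothesis is true and counts how often a two-sided z test rejects. The known-sigma z test, the parameter values, and the function names are assumptions made for this example; the text's own treatment is deferred to Chapter 9.

```python
import math
import random
import statistics
from statistics import NormalDist


def two_sided_critical_value(alpha):
    """Critical value chosen before seeing data, for a two-sided z test."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def z_statistic(sample, mu0, sigma):
    """Standardized distance of the sample mean from the null value mu0."""
    n = len(sample)
    return (statistics.fmean(sample) - mu0) / (sigma / math.sqrt(n))


def type_i_error_rate(mu0, sigma, n, alpha, reps=5000, seed=1):
    """Fraction of rejections when the null hypothesis is actually true."""
    rng = random.Random(seed)
    crit = two_sided_critical_value(alpha)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        if abs(z_statistic(sample, mu0, sigma)) > crit:
            rejections += 1
    return rejections / reps


critical_value = two_sided_critical_value(0.05)
rate = type_i_error_rate(mu0=120.0, sigma=15.0, n=25, alpha=0.05)
print(f"critical value: {critical_value:.3f}")
print(f"simulated type I error rate: {rate:.3f} (target 0.05)")
```

Because the data are generated under the null hypothesis, every rejection here is a false rejection, so the simulated rate should be close to the chosen alpha.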

In Section 9.9, we will describe a relationship between confidence intervals and hypothesis tests that enables one to construct a hypothesis test from a confidence interval or a confidence interval from a hypothesis test. Usually, hypothesis tests are constructed based directly on the sampling distribution of the point estimate. However, in Chapter 9 we will introduce the simplest form of bootstrap hypothesis testing. This test is based on a bootstrap percentile method confidence interval that we will introduce in Section 8.8.
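The idea of turning a confidence interval into a test can be sketched as follows: reject the null value when it falls outside the interval. The example uses a common form of the bootstrap percentile interval for the mean (resample with replacement, then take empirical quantiles of the resampled means). The data and the function names are hypothetical, and the details may differ from the versions developed in Sections 8.8 and 9.9.

```python
import random
import statistics


def bootstrap_percentile_ci(sample, level=0.95, reps=5000, seed=0):
    """Percentile interval for the mean from `reps` bootstrap resamples."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and record the mean of each resample.
    boot_means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(reps)
    )
    # Cut the same number of resampled means from each tail.
    tail = round((1.0 - level) / 2.0 * reps)
    return boot_means[tail], boot_means[reps - 1 - tail]


def rejects_null(sample, mu0, level=0.95):
    """Reject H0: mean == mu0 when mu0 lies outside the interval."""
    lower, upper = bootstrap_percentile_ci(sample, level)
    return not (lower <= mu0 <= upper)


# Hypothetical measurements, chosen only for illustration.
sample = [118, 122, 125, 115, 120, 130, 110, 119, 121, 124]
lower, upper = bootstrap_percentile_ci(sample)
print(f"bootstrap 95% interval: ({lower:.2f}, {upper:.2f})")
print("reject mean = 140?  ", rejects_null(sample, 140.0))
print("reject mean = 120.4?", rejects_null(sample, 120.4))
```

A null value far from the data falls outside the interval and is rejected, while a value equal to the sample mean falls inside and is not.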
