# Why Does Random Sampling Work?


## Chapter: Biostatistics for the Health Sciences: Defining Populations and Selecting Samples



We have illustrated an important property of simple random sampling, namely, that estimates of population averages are unbiased. Under certain conditions, appropriately chosen stratified random samples can produce unbiased estimates with better accuracy than simple random samples (see Cochran, 1977).

A quantity that provides a description of the accuracy of the estimate of a population mean is called the variance of the mean, and its square root is called the standard error of the mean. The symbol σ² is used to denote the population variance. (Chapter 4 will provide the formulas for σ².) When the population size N is very large, the sampling variance of the sample mean is known to be approximately σ²/n for a sample size of n.

In fact, as Cochran (1977) has shown, the exact sampling variance of the sample mean is slightly smaller than σ²/n because the population size N is finite. To account for this difference, a finite population correction factor is applied (see Chapter 4). If n is small relative to N, this correction factor can be ignored. Because the variance of the sample mean is approximately σ²/n, it becomes small as n becomes large, so individual sample means from large samples will tend to be close to the population mean.
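The relationship between σ²/n and the exact sampling variance can be checked with a small simulation. The sketch below is illustrative only: the population is synthetic (not data from the text), the function name `sampling_variance_demo` and its parameters are hypothetical, and the correction factor used, (N − n)/(N − 1), is the standard finite population correction when σ² is computed with divisor N. The text defers its own formula to Chapter 4, so treat this form as an assumption.

```python
import random
import statistics


def sampling_variance_demo(N=500, n=50, reps=20000, seed=0):
    """Compare the empirical variance of sample means with sigma^2/n.

    All default values are arbitrary choices for illustration.
    """
    rng = random.Random(seed)

    # Synthetic finite population of N values (assumption, not real data).
    population = [rng.gauss(50.0, 10.0) for _ in range(N)]

    # Population variance sigma^2, computed with divisor N.
    sigma2 = statistics.pvariance(population)

    # Draw many simple random samples WITHOUT replacement and record each mean.
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(reps)]

    empirical = statistics.pvariance(means)   # observed variance of the sample mean
    approx = sigma2 / n                       # large-N approximation
    exact = approx * (N - n) / (N - 1)        # with finite population correction
    return empirical, approx, exact


if __name__ == "__main__":
    empirical, approx, exact = sampling_variance_demo()
    print(f"empirical variance of sample means: {empirical:.4f}")
    print(f"sigma^2 / n (approximation):        {approx:.4f}")
    print(f"with finite population correction:  {exact:.4f}")
```

Here n is deliberately chosen as a sizable fraction of N (50 out of 500) so that the correction is visible. With n small relative to N, the two theoretical values are nearly identical, which is why the correction can then be ignored.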

Kuzma illustrated the phenomenon that large sample sizes produce highly accurate estimates of the population mean with his Honolulu Heart Study data (Kuzma, 1998; Kuzma and Bohnenblust, 2001). For his data, the population size for the male patients was N = 7683 (a relatively large number).

Kuzma determined that the population mean age for his data was 54.36 years. To examine how well samples estimate this value, he chose five simple random samples of size n = 100 and obtained sample mean ages of 54.85, 54.31, 54.32, 54.67, and 54.02. All of these estimates were within one-half year of the population mean. In Kuzma’s example, n was large enough that the variance of the sample mean was small. Consequently, all sample estimates were close to one another and to the population mean. Thus, in general, the larger the n, the closer the sample estimate of the mean tends to be to the population mean.
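An experiment of the same shape as Kuzma's can be imitated in a few lines. The following is a rough sketch, not a reproduction: only the population size N = 7683 and the sample size n = 100 come from the text, while the ages are synthetic values drawn from an arbitrary normal distribution, and the helper names `make_population` and `sample_means` are hypothetical. It draws five simple random samples of size 100 and then compares the spread of sample means for several sample sizes.

```python
import random
import statistics

N = 7683  # population size quoted in the text


def make_population(rng, size=N):
    # Synthetic ages (assumption): NOT the Honolulu Heart Study data.
    return [rng.gauss(54.0, 5.5) for _ in range(size)]


def sample_means(population, n, reps, rng):
    # Each repetition is one simple random sample drawn without replacement.
    return [statistics.fmean(rng.sample(population, n)) for _ in range(reps)]


rng = random.Random(1)
population = make_population(rng)
mu = statistics.fmean(population)  # population mean of the synthetic data

# Five samples of size n = 100, mirroring the design described above.
five = sample_means(population, 100, 5, rng)

# Standard deviation of the sample mean for increasing sample sizes.
spread = {
    n: statistics.pstdev(sample_means(population, n, 1000, rng))
    for n in (25, 100, 400)
}

if __name__ == "__main__":
    print(f"population mean: {mu:.2f}")
    print("five sample means (n = 100):", [round(m, 2) for m in five])
    for n, sd in spread.items():
        print(f"n = {n:3d}: standard deviation of sample means = {sd:.3f}")
```

Because the standard error is roughly σ/√n, quadrupling the sample size should roughly halve the spread of the sample means, which is the pattern the last loop is meant to show.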
