WHY SELECT A SAMPLE?
Often, it is too expensive or impossible to collect
information on an entire population. For appropriately chosen samples, accurate
statistical estimates of population parameters are possible. Even when we are
required to count the entire population as in a U.S. decennial census, sampling
can be used to improve estimates for important subpopulations (e.g., states,
counties, cities, or precincts).
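As a rough illustration of how a well-chosen sample can estimate a population parameter, the following Python sketch draws a simple random sample from a made-up population. The population size, the 40% true share, the sample size, and the function name estimate_proportion are all assumptions chosen for illustration; none of them comes from a real survey.

```python
import random

def estimate_proportion(population, sample_size, seed=0):
    """Estimate the share of 1s in `population` from a simple random sample."""
    rng = random.Random(seed)                     # fixed seed so the run is repeatable
    sample = rng.sample(population, sample_size)  # sampling without replacement
    return sum(sample) / sample_size

# Hypothetical population: 100,000 people, 40% of whom have some attribute (coded 1).
population = [1] * 40_000 + [0] * 60_000
true_share = sum(population) / len(population)

estimate = estimate_proportion(population, sample_size=1_000)
print(f"true share = {true_share:.3f}, sample estimate = {estimate:.3f}")
```

Only 1% of this hypothetical population is examined, yet an estimate from a random sample of this size typically lands within a few percentage points of the true share.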
In the 2000 national election, we learned
that the outcome of a presidential election in a single state (Florida) was
close enough to be in doubt as a consequence of various types of counting
errors or exclusion rules. So even when we think we are counting every vote
accurately, we may not be; surprisingly, a sample estimate may be more accurate than
a “complete” count.
As an example of a U.S. government agency that uses
sampling, consider the Internal Revenue Service (IRS). The IRS does not have
the manpower necessary to review every tax return for mistakes or
misrepresentation; instead, the IRS selects a sample of returns. The
IRS applies statistical methods to make it more likely that those returns prone
to error or fraud are selected in the sample.
A second example arises from reliability studies,
which may use destructive testing procedures. To illustrate, a medical device
company often tests the peel strength of its packaging material. The company
wants the material to peel when suitable force is applied but does not want the
seal to come open upon normal handling and shipping. The purpose of the seal is
to maintain sterility for medical products, such as catheters, contained in the
packages. Because these catheters will be placed inside patients’ hearts to
treat arrhythmias, maintenance of sterility in order to prevent infection is very
important. When performing reliability tests, it is feasible to peel only a
small percentage of the packages, because it is costly to waste good
packaging. On the other hand, accurate statistical inference requires selecting
sufficiently large samples.
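The tension between testing cost and accuracy can be made concrete with the usual formula for the standard error of a sample mean, sigma / sqrt(n). The sketch below assumes independent measurements and a hypothetical standard deviation of peel strength of 2.0 (arbitrary units); the figure and the function name standard_error are assumptions for illustration only.

```python
import math

def standard_error(sigma, n):
    """Standard error of a sample mean for n independent measurements."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / math.sqrt(n)

SIGMA = 2.0  # hypothetical standard deviation of peel strength (assumption)

# Compare three sample sizes, each four times the previous one.
for n in (10, 40, 160):
    print(f"n = {n:>3}: standard error = {standard_error(SIGMA, n):.3f}")
```

Each fourfold increase in the number of packages destroyed only halves the standard error, which is why the sample size has to be balanced against the cost of wasted packaging.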
One of the main challenges of statistics is to
select a sample in an efficient, appropriate way; the goal of sample selection
is to obtain results accurate enough to support meaningful inferences about
population characteristics. At this point, it may
not be obvious to you that the method of drawing a sample is important.
However, history has taught us that it is very easy to draw incorrect
inferences because samples were chosen inappropriately.
We often see the results of inappropriate sampling
in television and radio polls. This subtle problem is known as selection
bias. Often we are interested in a wider target population, but the poll is
based only on those individuals who listened to a particular TV or radio
program and chose to answer the questions. For instance, if there is a
political question and the program has a Republican commentator, the audience
may be more heavily Republican than the general target population.
Consequently, the survey results will not reflect the target population. In
this example, we are assuming that the response rate was sufficiently high to
produce reliable results had the sample been random.
Statisticians also call this type of sampling error
response bias. This bias often occurs when volunteers are asked to respond to a
poll. Even if the listeners of a particular radio or TV program are
representative of the target population, those who respond to the poll may not
be. Consequently, reputable polling organizations such as Gallup or Harris use well-established
statistical procedures to ensure that the sample is representative of the
population.
A classic example of failure to select a
representative sample of voters arose from the Literary Digest poll of 1936. In that year, the Literary Digest mailed out some 10
million ballots asking individuals to provide their preference for the
upcoming election between Franklin Roosevelt and Alfred Landon. Based on the
survey results derived from the return of 2.3 million ballots, the Literary Digest predicted that Landon
would be a big winner.
In fact, Roosevelt won the election with a comfortable
62% majority. This single poll destroyed the credibility of the Literary Digest and soon caused it to
cease publication. Subsequent analysis of their sampling technique showed that
the list of 10 million persons was taken primarily from telephone directories
and motor vehicle registration lists. In more recent surveys of voters, public
opinion organizations have found random-digit-dialed telephone surveys, as well
as surveys of drivers, to be acceptable, because almost every home in the
United States has a telephone and almost all citizens of voting age own or
lease automobiles and hence have driver's licenses. The requirement for the
pollsters is not that the list be exhaustive but rather that it be
representative of the entire population and thus not capable of producing a
large response or selection bias. However, in 1936, mostly Americans with high
incomes had phones or owned cars.
The Literary Digest poll selected a much larger proportion of high-income families than
are typical in the voting population. Also, the high-income families were more
likely to vote Republican than the lower-income families. Consequently, the
poll favored the Republican, Alf Landon, whereas the target population, which
contained a much larger proportion of low-income Democrats than were in the
survey, strongly favored the Democrat, Franklin Roosevelt. Had these economic
groups been sampled in the appropriate proportions, the poll would have correctly
predicted the outcome of the election.
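The arithmetic behind this kind of selection bias can be sketched in a few lines of Python. Every number here is hypothetical: the support rates for Roosevelt among high- and low-income voters, the 30% high-income share of voters, and the 80% high-income share of the mailing list are assumptions chosen to show the effect, not 1936 data, and roosevelt_share is an illustrative helper.

```python
def roosevelt_share(high_income_fraction, support_high=0.35, support_low=0.75):
    """Expected Roosevelt share in a sample with the given high-income fraction.

    support_high and support_low are hypothetical support rates for Roosevelt
    among high- and low-income voters (assumptions, not 1936 data).
    """
    if not 0.0 <= high_income_fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    low_income_fraction = 1.0 - high_income_fraction
    # Weighted average of the two groups' support rates.
    return (high_income_fraction * support_high
            + low_income_fraction * support_low)

# Hypothetical mixes: 30% high-income among voters, 80% in the mailing list.
population_mix = 0.30
biased_sample_mix = 0.80

print(f"voting population:   {roosevelt_share(population_mix):.2f}")
print(f"biased mailing list: {roosevelt_share(biased_sample_mix):.2f}")
```

With these made-up numbers, Roosevelt has 63% support among voters but only 43% in the biased list, so a poll drawn from that list would call the election for Landon. Sampling the two income groups in their population proportions recovers the 63%.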