TYPES OF STUDIES
Statisticians use data from a variety of sources:
observational data are from cross-sectional, retrospective, and prospective
studies; experimental data are derived from planned experiments and clinical
trials. What are some illustrations of the types of data from each of these
sources? Sometimes, observational data have been collected from naturally or
routinely occurring situations. Other times, they are collected for
administrative purposes; examples are data from medical records, government
agencies, or surveys. Experimental data include the results that have been
collected from formal intervention studies or clinical trials; some examples
are survival data, the proportion of patients who recover from a medical
procedure, and relapse rates after taking a new medication.
Most study designs contain one or more outcome
variables that are specified explicitly. (Sometimes, a study design may not
have an explicitly defined outcome variable but, rather, the outcome is
implicit; however, the use of an implicit outcome variable is not a desirable
practice.) Study outcome variables may range from counts of the number of cases
of illness or the number of deaths to responses to an attitude questionnaire.
In some disciplines, outcome variables are called dependent variables. The
researcher may wish to relate these outcomes to disease risk factors such as
exposure to toxic chemicals, electromagnetic radiation, or particular
medications, or to some other factor that is thought to be associated with a
particular health outcome.
In addition to outcome variables, study designs
assess exposure factors. For example, exposure factors may include toxic
chemicals and substances, ionizing radiation, and air pollution. Other types of
exposure factors, more formally known as risk factors, include a lack of
exercise, a high-fat diet, and smoking. In other disciplines, exposure factors
sometimes are called independent variables. However, epidemiologists prefer to
use the term exposure factor.
One important issue pertains to the time frame for
collection of data, whether information about exposure and outcome factors is
referenced about a single point in time or whether it involves looking backward
or forward in time. These distinctions are important because, as we will learn,
they affect both the types of analyses that we can perform and our confidence
about inferences that we can make from the analyses. The following
illustrations will clarify this issue.
A cross-sectional study is referenced about a
single point in time—now. That is, the reference point for both the exposure
and outcome variables is the present time. Most surveys represent
cross-sectional studies. For example, researchers who want to know about the
present health characteristics of a population might administer a survey to
answer the following kinds of questions: How many students smoke at a college
campus? Do men and women differ in their current levels of smoking?
Other varieties of surveys might ask subjects for
self-reports of health characteristics and then link the responses to physical
health assessments. Survey research might ascertain whether current weight is
related to systolic blood pressure levels or whether subgroups of populations
differ from one another in health characteristics; e.g., do Latinos in
comparison to non-Latinos differ in rates of diabetes? Thus, it is apparent
that although the term “cross-sectional study” may seem confusing at first, it
is actually quite simple. Cross-sectional studies, which typically involve
descriptive statistics, are useful for generating hypotheses that may be
explored in future research. These studies are not appropriate for making cause
and effect assertions. Examples of statistical methods appropriate for analysis
of cross-sectional data include cross-tabulations, correlation and regression,
and tests of differences between or among groups as long as time is not an
important factor in the inference.
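As a rough illustration of a cross-tabulation, the sketch below counts smokers by sex and computes the proportion of smokers in each group. The survey records are invented for illustration only, and the helper names (`cross_tabulate`, `row_proportion`) are hypothetical rather than part of any particular statistical package.

```python
from collections import Counter

# Hypothetical survey records: (sex, current smoker?). Invented data.
responses = [
    ("male", "yes"), ("male", "no"), ("male", "no"), ("male", "yes"),
    ("female", "no"), ("female", "no"), ("female", "yes"), ("female", "no"),
]


def cross_tabulate(records):
    """Count how many records fall in each (row, column) cell."""
    return Counter(records)


def row_proportion(table, row, column):
    """Proportion of one row's records that fall in the given column."""
    row_total = sum(n for (r, _), n in table.items() if r == row)
    return table[(row, column)] / row_total


table = cross_tabulate(responses)
male_smoking = row_proportion(table, "male", "yes")      # 2 of 4 men
female_smoking = row_proportion(table, "female", "yes")  # 1 of 4 women
print(male_smoking, female_smoking)
```

A difference between the two proportions describes the sample at the time of the survey; on its own it does not support a cause-and-effect statement.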
A retrospective study is one in which the focus
upon the risk factor or exposure factor for the outcome is in the past. One
type of retrospective study is the case-control study, in which patients who
have a disease of interest to the researchers are asked about their prior
exposure to a hypothesized risk factor for the disease. These patients (the
cases) are matched to patients without the disease (the controls) who have
similar demographic characteristics.
Health researchers employ case-control studies
frequently when rapid and inexpensive answers to a question are required.
Investigations of food-borne illness require a speedy response to stop the
outbreak. In the hypothetical investigation of a suspected outbreak of E. coli-associated food-borne illness,
public health officials would try to identify all of the cases of illness that
occurred in the outbreak and administer a standardized questionnaire to the
victims in order to determine which foods they consumed. In case-control
studies, statisticians evaluate associations and learn about risk factors and
health outcomes through the use of odds ratios (see Chapter 11).
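Ahead of the fuller treatment in Chapter 11, the odds ratio for a case-control study can be sketched as follows. The counts are hypothetical (they do not come from any real outbreak), and the function name `odds_ratio` is an assumption made for this illustration.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    if 0 in (unexposed_cases, exposed_controls, unexposed_controls):
        raise ValueError("odds ratio is undefined when a denominator count is zero")
    odds_among_cases = exposed_cases / unexposed_cases
    odds_among_controls = exposed_controls / unexposed_controls
    return odds_among_cases / odds_among_controls


# Hypothetical questionnaire results for one suspect food item:
# 30 of 40 ill persons (cases) ate it; 20 of 60 well persons (controls) ate it.
estimate = odds_ratio(exposed_cases=30, unexposed_cases=10,
                      exposed_controls=20, unexposed_controls=40)
print(estimate)  # (30/10) / (20/40) = 6.0
```

An odds ratio well above 1 suggests that the exposure is associated with the illness; a value near 1 suggests no association.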
Prospective studies follow subjects from the
present into the future. In the health sciences, one example is called a
prospective cohort study, which begins with individuals who are free from
disease, but who have an exposure factor. An example would be a study that
follows a group of young persons who are initiating smoking and who are free
from tobacco-related diseases. Researchers might follow these youths into the
future in order to note their development of lung cancer or emphysema. Because
many chronic, noninfectious diseases have a long latency period and low
incidence (occurrence of new cases) in the population, cohort studies are
time-consuming and expensive in comparison to other methodologies. In cohort
studies, epidemiologists often use relative risk (RR) as a measure of
association between risk exposure and disease. The term relative risk is
explained in Chapter 11.
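As a preview, relative risk compares the incidence of disease in the exposed group with the incidence in the unexposed group. The sketch below uses hypothetical cohort counts, and the function name `relative_risk` is an assumption for this illustration.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Incidence among the exposed divided by incidence among the unexposed."""
    if exposed_total == 0 or unexposed_total == 0:
        raise ValueError("each group must contain at least one subject")
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    if risk_unexposed == 0:
        raise ValueError("relative risk is undefined when the unexposed risk is zero")
    return risk_exposed / risk_unexposed


# Hypothetical cohort: 30 of 100 smokers and 10 of 100 nonsmokers
# develop the disease during follow-up.
rr = relative_risk(exposed_cases=30, exposed_total=100,
                   unexposed_cases=10, unexposed_total=100)
print(round(rr, 2))
```

Unlike the odds ratio, this calculation needs the total number of subjects at risk in each group, which a cohort design provides because subjects are followed from exposure forward.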
An experimental study is one in which there is a
study group and a control group as well as an independent (causal) variable and
a dependent (outcome) variable. Subjects who participate in the study are
assigned randomly to either the study or control conditions. The investigator
manipulates the independent variable and observes its influence upon the
dependent variable. This study design is similar to those that the reader may
have heard about in a psychology course. Experimental designs also are related
to clinical trials, which were described earlier in this chapter.
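Random assignment can be sketched in a few lines. The example below shuffles a list of subject identifiers and splits it into two groups of equal size; the identifiers, the seed value, and the function name `randomize` are assumptions for illustration, and real trials often use more elaborate schemes.

```python
import random


def randomize(subject_ids, seed=None):
    """Randomly split subjects into study and control groups of (nearly) equal size."""
    rng = random.Random(seed)      # a fixed seed makes the allocation reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"study": shuffled[:half], "control": shuffled[half:]}


# Twenty hypothetical subjects, numbered 1 through 20.
groups = randomize(range(1, 21), seed=2024)
print(len(groups["study"]), len(groups["control"]))
```

Because chance alone determines each subject's group, known and unknown characteristics tend to balance out between the two groups.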
Experimental studies are used extensively in product
quality control. The manufacturing and agricultural industries have pioneered
the application of statistical design methods to the production of first-rate,
competitive products. These methods also are used for continuous process
improvement. The following statistical methods have been the key tools in this
success:
· Design of Experiments (DOE, methods for varying conditions to look at the
effects of certain variables on the output)
· Response Surface Methodology (RSM, methods for changing the experimental
conditions to move quickly toward optimal experimental conditions)
· Statistical Process Control (SPC, procedures that involve the plotting of
data over time to track performance and identify changes that indicate
possible problems)
· Evolutionary Operation (EVOP, methods to adjust processes to reach optimal
conditions as processes change or evolve over time)
Data from such experiments are often analyzed using
linear or nonlinear statistical models. The simplest of these models (simple
linear regression and the one-way analysis of variance) are covered in Chapters
12 and 13, respectively, of this text. However, we do not cover the more
general models, nor do we cover the methods of experimental design and quality
control. Good references for DOE are Montgomery (1997) and Wu and Hamada
(2000). Montgomery (1997) also covers EVOP. Myers and Montgomery (1995) is a
good source for information on RSM. Ryan (1989) and Vardeman and Jobe (1999)
are good sources for SPC and other quality assurance methods.
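Although these methods are not covered in this text, the idea behind SPC can be sketched simply. The example below sets control limits from baseline measurements and flags later measurements that fall outside them. The three-standard-deviation rule is a common convention assumed here, not a method specified in this chapter; the measurements and function names are hypothetical.

```python
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Return (lower, center, upper) limits: center plus or minus k standard deviations."""
    center = mean(baseline)
    spread = stdev(baseline)   # sample standard deviation of the baseline data
    return center - k * spread, center, center + k * spread


def out_of_control(measurements, lower, upper):
    """List (position, value) pairs for measurements outside the limits."""
    return [(i, x) for i, x in enumerate(measurements) if x < lower or x > upper]


# Hypothetical baseline measurements from a process believed to be stable.
baseline = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.1, 9.9]
lower, center, upper = control_limits(baseline)

# Hypothetical later measurements, checked in time order.
later = [10.0, 10.1, 11.5, 9.9]
flagged = out_of_control(later, lower, upper)
print(flagged)
```

A flagged point does not prove that something is wrong; it signals that the process should be inspected for a possible problem.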
In the mid-1920s, quality control methods in the
United States began with the work of Shewhart at Bell Laboratories and
continued through the 1960s. In general, the concept of quality control
involves a method for maximizing the quality of goods produced or a
manufacturing process. Quality control entails planning, ongoing inspections,
and taking corrective actions, if necessary, to maintain high standards. This
methodology is applicable to many settings that need to maintain high operating
standards. For example, the U.S. space program depends on highly redundant
systems that use the best concepts from the field of reliability, an aspect of
quality control.
Somehow, the U.S. manufacturing industry in the
1970s lost its knowledge of quality controls. The Japanese learned these ideas
from Ed Deming and others and quickly surpassed the U.S. in quality production,
especially in the automobile industry in the late 1980s. Recently, by
incorporating DOE and SPC methods, U.S. manufacturing has made a comeback. Many
companies have made dramatic improvements in their production processes through
a formalized training program called Six Sigma. A detailed picture of all these
quality control methods can be found in Juran and Godfrey (1999).
Quality control is important in engineering and
manufacturing, but why would a student in the health sciences be interested in
it? One answer comes from the growing medical device industry. Companies now
produce catheters that can be used for ablation of arrhythmias and diagnosis of
heart ailments and also experimentally for injection of drugs to improve the
cardiovascular system of a patient. Firms also produce stents for angioplasty,
implantable pacemakers to correct bradycardia (slow heart rate that causes
fatigue and can lead to fainting), and implantable defibrillators that can
prevent ventricular fibrillation, which can lead to sudden death. These devices
already have had a big impact on improving and prolonging life. Their use and
value to the health care industry will continue to grow.
Because these medical devices can be critical to
the lives of patients, their safety and effectiveness must be demonstrated to
regulatory bodies. In the United States, the governing regulatory body is the
FDA. Profitable marketing of a device generally occurs after a company has
conducted a successful clinical trial of the device. These devices must be
reliable; quality control procedures are necessary to ensure that the
manufacturing process continues to work properly.
Similar arguments can be made for the control of
processes at pharmaceutical plants, which produce prescription drugs that are
important for maintaining the health of patients under treatment. Tablets,
serums, and other drug regimens must be of consistently high quality and contain
the correct dose as described on the label.
A clinical trial is defined as “. . . an experiment
performed by a health care organization or professional to evaluate the effect
of an intervention or treatment against a control in a clinical environment. It
is a prospective study to identify outcome measures that are influenced by the
intervention. A clinical trial is designed to maintain health, prevent
diseases, or treat diseased subjects. The safety, efficacy, pharmacological,
pharmacokinetic, quality-of-life, health economics, or biochemical effects are
measured in a clinical trial” (Chow, 2000, p. 110).
Clinical trials are conducted with human subjects
(who are usually patients). Before the patients can be enrolled in the trial,
they must be informed about the perceived benefits and risks. The process of
apprising the patients about benefits and risks is accomplished by using an
informed consent form that the patient must sign. Each year in the United
States, many companies perform clinical trials. The impetus for these trials is
the development of new drugs or medical devices that the companies wish to
bring to market. A primary objective of these clinical trials is to demonstrate
the safety and effectiveness of the products to the FDA.
Clinical trials take many forms. In a randomized,
controlled clinical trial, patients are randomized into treatment and control
groups. Sometimes, only a single treatment group and a historical control group
are used. This procedure may be followed when the use of a concurrent control
group would be expensive or would expose patients in the control group to undue
risks. In the medical device industry, the control also can be replaced by an
objective performance criterion (OPC). Established standards for current forms
of available treatments can be used to determine these OPCs. Patients who
undergo the current forms of available treatment thus constitute a control
group. Generally, a large amount of historical data is needed to establish an
OPC.
Concurrent randomized controls are often preferred
to historical controls because the investigators want to have a sound basis for
attributing observed differences between the treatment and control groups to
treatment effects. If the trial is conducted without concurrent randomized
controls, statisticians can argue that any differences shown could be due to
differences among the study patient populations rather than to differences in
the treatment. As an example, in a hypothetical study conducted in Southern
California, a suitable historical control group might consist of Hispanic
women. However, if the treatment were intended for males as well as females
(including both genders from many other races), a historical control group
composed of Hispanic women would be inappropriate. In addition, if we then
were to use a diverse population of males and females of all races for the
treatment group only, how would we know that any observed effect was due to the
treatment and not simply to the fact that males respond differently from
females or that racial differences are playing a role in the response? Thus,
the use of a concurrent control group would overcome the difficulties produced
by a historical control group.
In addition, in order to avoid potential bias,
patients are often blinded as to study conditions (i.e., treatment or control
group), when such blinding is possible. It is also preferable to blind the
investigator to the study conditions to prevent bias that could invalidate the
study conclusions. When both the investigator and the patient are blinded, the
trial is called double-blinded. Double-blinding often is possible in drug
treatment studies but rarely is possible in medical device trials. In device
trials, the patient sometimes can be blinded but the attending physician cannot
be.
To illustrate the scientific value of randomized,
blinded, controlled, clinical trials, we will describe a real trial that was
sponsored by a medical device company that produces and markets catheters. The
trial was designed to determine the safety and efficacy of direct myocardial
revascularization (DMR). DMR is a clinical procedure designed to improve
cardiac circulation (also called perfusion). The medical procedure involves the
placement of a catheter in the patient’s heart. A small laser on the tip of the
catheter is fired to produce channels in the heart muscle that theoretically
promote cardiac perfusion. The end result should be improved heart function in
those patients who are suffering from severe symptomatic coronary artery disease.
In order to determine if this theory works in
practice, clinical trials were required. Some studies were conducted in which
patients were given treadmill tests before and after treatment in order to
demonstrate increased cardiac output. Other measures of improved heart function
also were considered in these studies. Results indicated promise for the
treatment.
However, critics charged that because these trials
did not have randomized controls, a placebo effect (i.e., patients improve
because of a perceived benefit from knowing that they received a treatment)
could not be ruled out. In the DMR DIRECT trial, patients were randomized to a
treatment group and a sham control group. The sham is a procedure used to keep
the patient blinded to the treatment. In all cases the laser catheter was
placed in the heart. The laser was fired in the patients randomized to the DMR
treatment group but not in the patients randomized to the control group. This
was a single-blinded trial; i.e., none of the patients knew whether or not they
received the treatment. Obviously, the physician conducting the procedure had
to know which patients were in the treatment and control groups. The patients,
who were advised of the possibility of the sham treatment in the informed consent
form, of course received standard care for their illness.
At the follow-up tests, everyone involved,
including the physicians, was blinded to the group associated with the laser
treatment. For a certain period after the data were analyzed, the results were
known only to the independent group of statisticians who had designed the trial
and then analyzed the data.
These results were released and made public in
October 2000. Quoting the press release, “Preliminary analysis of the data
shows that patients who received this laser-based therapy did not experience a
statistically significant increase in exercise times or a decrease in the
frequency and severity of angina versus the control group of patients who were
treated medically. An improvement across all study groups may suggest a
possible placebo effect.”
As a result of this trial, the potential benefit of
DMR was found not to be significant and not worth the added risk to the
patient. Companies and physicians looking for effective treatments for these
patients must now consider alternative therapies. The trial saved the sponsor,
its competitors, the patients, and the physicians from further use of an
ineffective and highly invasive treatment.
As seen in the foregoing section, clinical trials
illustrate one field that requires much biostatistical expertise. Epidemiology
is another such field. Epidemiology is defined as the study of the distribution
and determinants of health and disease in populations.
Although experimental methods including clinical
trials are used in epidemiology, a major group of epidemiological studies use
observational techniques that were formalized during the mid-19th century. In
his classic work, John Snow reported on attempts to investigate the source of a
cholera outbreak that plagued London in 1849. Snow hypothesized that the
outbreak was associated with polluted water drawn from the Thames River. Both
the Lambeth Company and the Southwark and Vauxhall Company provided water
inside the city limits of London. At first, both the Lambeth Company and the
Southwark and Vauxhall Company took water from a heavily polluted section of
the Thames River.
The Broad Street area of London provided an
excellent opportunity to test this hypothesis because households in the same
neighborhood were served by interdigitating water supplies from the two
different companies. That is, households in the same geographic area (even
adjacent houses) received water from the two companies. This observation by
Snow made it possible to link cholera outbreaks in a particular household with
one of the two water sources.
Subsequently, the Lambeth Company relocated its
water source to a less contaminated section of the river. During the cholera
outbreak of 1854, Snow demonstrated that a much greater proportion of residents
who used water from the more polluted source contracted cholera than those who
used water from the less polluted source. Snow’s method, still in use today,
came to be known as a natural experiment [see Friis and Sellers (1999) for more
details].
Snow’s investigation of the cholera outbreak
illustrates one of the main approaches of epidemiology—use of observational
studies. These observational study designs encompass two major categories:
descriptive and analytic. Descriptive studies attempt to classify the extent
and distribution of disease in populations. In contrast, analytic studies are
concerned with causes of disease. Descriptive studies rely on a variety of
techniques: (1) case reports, (2) astute clinical observations, and (3) use of
statistical methods of description, e.g., showing how disease frequency varies
in the population according to demographic variables such as age, sex, race,
and socioeconomic status.
For example, Morbidity and Mortality Reports, published by the Centers for
Disease Control (CDC) in Atlanta, periodically issues data on persons
diagnosed with acquired immune deficiency syndrome (AIDS) classified
according to demographic subgroups within the United States. With respect to
HIV and AIDS, these descriptive studies are
vitally important for showing the nation’s progress in controlling the AIDS
epidemic, identifying groups at high risk, and suggesting needed health care
services and interventions. Descriptive studies also set the stage for analytic
studies by suggesting hypotheses to be explored in further research.
Snow’s natural experiment provides an excellent
example of both descriptive and analytic methodology. The reader can probably
think of many other examples that would interest statisticians. Many natural
experiments are the consequences of government policies. To illustrate,
California has introduced many innovative laws to control tobacco use. One of
these, the Smoke-free Bars Law, has provided an excellent opportunity to
investigate the health effects of prohibiting smoking in alcohol-serving
establishments. Natural experiments create a scenario for researchers to test
causal hypotheses. Examples of analytic research designs include ecological,
case-control, and cohort studies.
We previously defined case-control (Section 1.3.2,
Retrospective Studies) and cohort studies (Section 1.3.3, Prospective Studies).
Case-control studies have been used in such diverse naturally occurring
situations as exploring the causes of toxic shock syndrome among tampon users
and investigating diethylstilbestrol as a possible cause of birth defects.
Cohort studies such as the famous Framingham Study have been used in the
investigation of cardiovascular risk factors.
Finally, ecologic studies involve the study of
groups, rather than the individual, as the unit of analysis. Examples are
comparisons of national variations in coronary heart disease mortality or
variations in mortality at the census tract level. In the former example, a
country is the “group,” whereas in the latter, a census tract is the group.
Ecologic studies have linked high fat diets to high levels of coronary heart
disease mortality. Other ecologic studies have suggested that congenital
malformations may be associated with concentrations of hazardous wastes.
Pharmacoeconomics examines the tradeoff of cost
versus benefit for new drugs. The high cost of medical care has caused HMOs,
other health insurers, and even some regulatory bodies to consider the economic
aspects of drug development and marketing. Cost control became an important
discipline in the development and marketing of drugs in the 1990s and will
continue to grow in importance during the current century. Pharmaceutical
companies are becoming increasingly aware of the need to gain expertise in
pharmacoeconomics as they start to implement cost control techniques in
clinical trials as part of winning regulatory approvals and, more importantly,
convincing pharmacies of the value of stocking their products. The
ever-increasing cost of medical care has led manufacturers of medical devices
and pharmaceuticals to recognize the need to evaluate products in terms of cost
versus effectiveness in addition to the usual efficacy and safety criteria that
are standard for regulatory approvals. The regulatory authorities in many
countries also see the need for these studies.
Predicting the cost versus benefit of a newly
developed drug involves an element of uncertainty. Consequently, statistical
methods play an important role in such analyses. Currently, there are many
articles and books on projecting the costs versus benefits in new drug
development. A good starting point is Bootman (1996). One of the interesting
and important messages from Bootman’s book is the need to consider a
perspective for the analysis. The perceptions of cost/benefit tradeoffs differ
depending on whether they are seen from the patient’s perspective, the
physician’s perspective, society’s perspective, an HMO’s perspective, or a
pharmacy’s perspective. The perspective has an important effect on which
drug-related costs should be included, what comparisons should be made between
alternative formulations, and which type of analysis is needed. Further
discussion of cost/benefit tradeoffs is beyond the scope of this text.
Nevertheless, it is important for health scientists to be aware of such
tradeoffs.
Quality of life has played an increasing role in
the study of medical treatments for patients. Physicians, medical device
companies, and pharmaceutical firms have started to recognize that the
patient’s own feeling of well-being after a treatment is as important or more
important than some clinically measurable efficacy parameters. Also, in
comparing alternative treatments, providers need to realize that many products
are basically equivalent in terms of the traditional safety and efficacy
measures and that what might set one treatment apart from the others could be
an increase in the quality of a patient’s life. In the medical research
literature, you will see many terms that all basically deal with the patients’
view of the quality of their life. These terms and acronyms are quality of life
(QoL), health related quality of life (HRQoL), outcomes research, and patient
reported outcomes (PRO).
Quality of life usually is measured through
specific survey questionnaires. Researchers have developed and validated many
questionnaires for use in clinical trials to establish improvements in aspects
of patients’ quality of life. These questionnaires, which are employed to
assess quality of life issues, generate qualitative data.
In Chapter 12, we will introduce you to research
that involves the use of statistical analysis measures for qualitative data.
The survey instruments, their validation and analysis are worthy topics for an
entire book. For example, Fayers and Machin (2000) give an excellent
introduction to this subject matter.
In conclusion, Chapter 1 has presented introductory
material regarding the field of statistics. This chapter has illustrated how
statistics are important in everyday life and, in particular, has demonstrated
how statistics are used in the health sciences. In addition, the chapter has
reviewed major job roles for statisticians. Finally, information was presented
on major categories of study designs and sources of health data that
statisticians may encounter. Tables 1.1 through 1.3 review and summarize the
key points presented in this chapter regarding the uses of statistics, job
roles for statisticians, and sources of health data.
Table 1.1. Uses of statistics
1. Interpret research studies
   Example: Validity of findings of health education and medical research
2. Evaluate statistics used every day
   Examples: Hospital mortality rates, prevalence of infectious diseases
3. Presentation of data to audiences
   Effective arrangement and grouping of information and graphical display
   of data
4. Illustrate central tendency and variability
5. Formulate and test hypotheses
   Generalize from a sample to the population.

Table 1.2. Job roles for statisticians
1. Guide design of an experiment, clinical trial, or survey
2. Formulate statistical hypotheses and determine appropriate methodology
3. Analyze data
4. Present and interpret results

Table 1.3. Sources of health data
1. Archival and vital statistics records
2. Experiments
3. Medical research studies
   Retrospective: case-control
   Prospective: cohort study
4. Descriptive surveys
5. Clinical trials