# Types of Studies


## Chapter: Biostatistics for the Health Sciences: What is Statistics? How Is It Applied to the Health Sciences?


Statisticians use data from a variety of sources: observational data are from cross-sectional, retrospective, and prospective studies; experimental data are derived from planned experiments and clinical trials. What are some illustrations of the types of data from each of these sources? Sometimes, observational data have been collected from naturally or routinely occurring situations. Other times, they are collected for administrative purposes; examples are data from medical records, government agencies, or surveys. Experimental data include the results that have been collected from formal intervention studies or clinical trials; some examples are survival data, the proportion of patients who recover from a medical procedure, and relapse rates after taking a new medication.

Most study designs contain one or more outcome variables that are specified explicitly. (Sometimes, a study design may not have an explicitly defined outcome variable but, rather, the outcome is implicit; however, the use of an implicit outcome variable is not a desirable practice.) Study outcome variables may range from counts of the number of cases of illness or the number of deaths to responses to an attitude questionnaire. In some disciplines, outcome variables are called dependent variables. The researcher may wish to relate these outcomes to disease risk factors such as exposure to toxic chemicals, electromagnetic radiation, or particular medications, or to some other factor that is thought to be associated with a particular health outcome.

In addition to outcome variables, study designs assess exposure factors. For example, exposure factors may include toxic chemicals and substances, ionizing radiation, and air pollution. Other types of exposure factors, more formally known as risk factors, include a lack of exercise, a high-fat diet, and smoking. In other disciplines, exposure factors sometimes are called independent variables. However, epidemiologists prefer to use the term exposure factor.

One important issue pertains to the time frame for collection of data: whether information about exposure and outcome factors refers to a single point in time or involves looking backward or forward in time. These distinctions are important because, as we will learn, they affect both the types of analyses that we can perform and our confidence about inferences that we can make from the analyses. The following illustrations will clarify this issue.

### 1. Surveys and Cross-Sectional Studies

A cross-sectional study is referenced about a single point in time—now. That is, the reference point for both the exposure and outcome variables is the present time. Most surveys represent cross-sectional studies. For example, researchers who want to know about the present health characteristics of a population might administer a survey to answer the following kinds of questions: How many students smoke at a college campus? Do men and women differ in their current levels of smoking?

Other varieties of surveys might ask subjects for self-reports of health characteristics and then link the responses to physical health assessments. Survey research might ascertain whether current weight is related to systolic blood pressure levels or whether subgroups of populations differ from one another in health characteristics; e.g., do Latinos in comparison to non-Latinos differ in rates of diabetes? Thus, it is apparent that although the term “cross-sectional study” may seem confusing at first, it is actually quite simple. Cross-sectional studies, which typically involve descriptive statistics, are useful for generating hypotheses that may be explored in future research. These studies are not appropriate for making cause and effect assertions. Examples of statistical methods appropriate for analysis of cross-sectional data include cross-tabulations, correlation and regression, and tests of differences between or among groups as long as time is not an important factor in the inference.
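As a minimal sketch of the cross-tabulation mentioned above, the following Python example counts respondents by sex and current smoking status and computes the proportion of smokers in each group. The ten survey responses and the function names `cross_tabulate` and `smoking_prevalence` are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter

# Hypothetical survey responses: (sex, is a current smoker) for ten students.
responses = [
    ("male", True), ("male", False), ("male", False), ("male", True),
    ("male", False), ("female", False), ("female", True),
    ("female", False), ("female", False), ("female", False),
]

def cross_tabulate(rows):
    """Count respondents in each (sex, smoker) cell of the table."""
    return Counter(rows)

def smoking_prevalence(rows, sex):
    """Proportion of current smokers among respondents of the given sex."""
    group = [smokes for s, smokes in rows if s == sex]
    return sum(group) / len(group)

table = cross_tabulate(responses)
for sex in ("male", "female"):
    smokers = table[(sex, True)]
    nonsmokers = table[(sex, False)]
    print(sex, smokers, nonsmokers, smoking_prevalence(responses, sex))
```

Because both variables refer to the present, a difference between the two proportions describes the current population; it does not by itself support a cause and effect claim.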

### 2. Retrospective Studies

A retrospective study is one in which the focus upon the risk factor or exposure factor for the outcome is in the past. One type of retrospective study is the case-control study, in which patients who have a disease of interest to the researchers are asked about their prior exposure to a hypothesized risk factor for the disease. These patients represent the case data that are matched to patients without the disease but with similar demographic characteristics.

Health researchers employ case-control studies frequently when rapid and inexpensive answers to a question are required. Investigations of food-borne illness require a speedy response to stop the outbreak. In the hypothetical investigation of a suspected outbreak of E. coli-associated food-borne illness, public health officials would try to identify all of the cases of illness that occurred in the outbreak and administer a standardized questionnaire to the victims in order to determine which foods they consumed. In case-control studies, statisticians evaluate associations and learn about risk factors and health outcomes through the use of odds ratios (see Chapter 11).
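To make the odds ratio concrete before its full treatment in Chapter 11, here is a short Python sketch that computes it from a 2 × 2 table of exposure by case status. The counts are invented for a hypothetical food-borne outbreak, and the function name `odds_ratio` is an assumption of this example. The calculation shown is the simple cross-product ratio for unmatched data; a matched design would call for a different analysis.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio for an unmatched 2 x 2 table.

    Odds of exposure among cases divided by odds of exposure among controls.
    """
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

# Hypothetical outbreak data: did the person eat the suspect food?
# Cases (ill):      30 ate it, 10 did not.
# Controls (well):  20 ate it, 40 did not.
or_estimate = odds_ratio(30, 10, 20, 40)
print(or_estimate)  # (30 * 40) / (10 * 20) = 6.0
```

An odds ratio well above 1, as in this made-up table, would point to the suspect food as a possible source and justify further investigation.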

### 3. Prospective Studies

Prospective studies follow subjects from the present into the future. In the health sciences, one example is called a prospective cohort study, which begins with individuals who are free from disease, but who have an exposure factor. An example would be a study that follows a group of young persons who are initiating smoking and who are free from tobacco-related diseases. Researchers might follow these youths into the future in order to note their development of lung cancer or emphysema. Because many chronic, noninfectious diseases have a long latency period and low incidence (occurrence of new cases) in the population, cohort studies are time-consuming and expensive in comparison to other methodologies. In cohort studies, epidemiologists often use relative risk (RR) as a measure of association between risk exposure and disease. The term relative risk is explained in Chapter 11.
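The relative risk can be sketched in a few lines of Python ahead of the formal discussion in Chapter 11. The cohort sizes and case counts below are hypothetical, and the function name `relative_risk` is an assumption of this example.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk of disease in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed

# Hypothetical cohort followed over time:
# 30 of 1000 smokers and 10 of 1000 nonsmokers develop the disease.
rr = relative_risk(30, 1000, 10, 1000)
print(round(rr, 2))
```

In this invented cohort the exposed group has about three times the risk of the unexposed group. Unlike the odds ratio, this measure requires knowing how many subjects were at risk in each group, which is what following a cohort forward in time provides.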

### 4. Experimental Studies and Quality Control

An experimental study is one in which there is a study group and a control group as well as an independent (causal) variable and a dependent (outcome) variable. Subjects who participate in the study are assigned randomly to either the study or control conditions. The investigator manipulates the independent variable and observes its influence upon the dependent variable. This study design is similar to those that the reader may have heard about in a psychology course. Experimental designs also are related to clinical trials, which were described earlier in this chapter.
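Random assignment can be illustrated with a brief Python sketch. It shuffles a list of subject identifiers and splits it into two equally sized groups; this is simple randomization only, and real trials often use more elaborate schemes. The function name `randomize`, the subject numbering, and the fixed seed are assumptions made for the example.

```python
import random

def randomize(subject_ids, seed=None):
    """Randomly split subjects into study and control groups of (nearly) equal size."""
    rng = random.Random(seed)      # a seed makes the allocation reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"study": sorted(shuffled[:half]), "control": sorted(shuffled[half:])}

# Hypothetical trial with 20 enrolled subjects numbered 1 to 20.
groups = randomize(range(1, 21), seed=0)
print(groups["study"])
print(groups["control"])
```

Because chance alone decides who lands in each group, known and unknown subject characteristics tend to balance out between the study and control conditions.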

Experimental studies are used extensively in product quality control. The manufacturing and agricultural industries have pioneered the application of statistical design methods to the production of first-rate, competitive products. These methods also are used for continuous process improvement. The following statistical methods have been the key tools in this success:

- Design of Experiments (DOE, methods for varying conditions to look at the effects of certain variables on the output)
- Response Surface Methodology (RSM, methods for changing the experimental conditions to move quickly toward optimal experimental conditions)
- Statistical Process Control (SPC, procedures that involve the plotting of data over time to track performance and identify changes that indicate possible problems)
- Evolutionary Operation (EVOP, methods to adjust processes to reach optimal conditions as processes change or evolve over time)
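Of the four methods listed, SPC is the easiest to sketch in code. The Python example below derives a center line and three-sigma control limits from baseline measurements and flags later points that fall outside them. It is a simplified illustration: the measurements are hypothetical tablet doses in milligrams, the function names are assumptions, and the sample standard deviation stands in for the moving-range estimate that a classical individuals chart would typically use.

```python
from statistics import mean, stdev

def control_limits(baseline):
    """Return (lower limit, center line, upper limit) using mean +/- 3 standard deviations."""
    center = mean(baseline)
    sigma = stdev(baseline)  # simplification: sample SD of the baseline data
    return center - 3 * sigma, center, center + 3 * sigma

def out_of_control(points, lower, upper):
    """List (index, value) pairs for points outside the control limits."""
    return [(i, x) for i, x in enumerate(points) if x < lower or x > upper]

# Hypothetical in-control baseline: measured dose (mg) of eight tablets.
baseline = [100.1, 99.8, 100.3, 99.9, 100.0, 100.2, 99.7, 100.0]
lower, center, upper = control_limits(baseline)

# Hypothetical new measurements taken later from the same process.
new_points = [100.1, 99.9, 100.9, 100.0]
print(out_of_control(new_points, lower, upper))
```

With these made-up numbers the limits are roughly 99.4 mg and 100.6 mg, so the third new measurement is flagged as a signal that the process may have changed.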

Data from such experiments are often analyzed using linear or nonlinear statistical models. The simplest of these models (simple linear regression and the one-way analysis of variance) are covered in Chapters 12 and 13, respectively, of this text. However, we do not cover the more general models, nor do we cover the methods of experimental design and quality control. Good references for DOE are Montgomery (1997) and Wu and Hamada (2000). Montgomery (1997) also covers EVOP. Myers and Montgomery (1995) is a good source for information on RSM. Ryan (1989) and Vardeman and Jobe (1999) are good sources for SPC and other quality assurance methods.

In the mid-1920s, quality control methods in the United States began with the work of Shewhart at Bell Laboratories and continued through the 1960s. In general, the concept of quality control involves a method for maximizing the quality of goods produced or a manufacturing process. Quality control entails planning, ongoing inspections, and taking corrective actions, if necessary, to maintain high standards. This methodology is applicable to many settings that need to maintain high operating standards. For example, the U.S. space program depends on highly redundant systems that use the best concepts from the field of reliability, an aspect of quality control.

Somehow, the U.S. manufacturing industry in the 1970s lost its knowledge of quality control. The Japanese learned these ideas from Ed Deming and others and quickly surpassed the U.S. in quality production, especially in the automobile industry in the late 1980s. Recently, by incorporating DOE and SPC methods, U.S. manufacturing has made a comeback. Many companies have made dramatic improvements in their production processes through a formalized training program called Six Sigma. A detailed picture of all these quality control methods can be found in Juran and Godfrey (1999).

Quality control is important in engineering and manufacturing, but why would a student in the health sciences be interested in it? One answer comes from the growing medical device industry. Companies now produce catheters that can be used for ablation of arrhythmias and diagnosis of heart ailments and also experimentally for injection of drugs to improve the cardiovascular system of a patient. Firms also produce stents for angioplasty, implantable pacemakers to correct bradycardia (slow heart rate that causes fatigue and can lead to fainting), and implantable defibrillators that can prevent ventricular fibrillation, which can lead to sudden death. These devices already have had a big impact on improving and prolonging life. Their use and value to the health care industry will continue to grow.

Because these medical devices can be critical to the lives of patients, their safety and effectiveness must be demonstrated to regulatory bodies. In the United States, the governing regulatory body is the FDA. Profitable marketing of a device generally occurs after a company has conducted a successful clinical trial of the device. These devices must be reliable; quality control procedures are necessary to ensure that the manufacturing process continues to work properly.

Similar arguments can be made for the control of processes at pharmaceutical plants, which produce prescription drugs that are important for maintaining the health of patients under treatment. Tablets, serums, and other drug regimens must be of consistently high quality and contain the correct dose as described on the label.

### 5. Clinical Trials

A clinical trial is defined as “. . . an experiment performed by a health care organization or professional to evaluate the effect of an intervention or treatment against a control in a clinical environment. It is a prospective study to identify outcome measures that are influenced by the intervention. A clinical trial is designed to maintain health, prevent diseases, or treat diseased subjects. The safety, efficacy, pharmacological, pharmacokinetic, quality-of-life, health economics, or biochemical effects are measured in a clinical trial.” (Chow, 2000, p. 110).

Clinical trials are conducted with human subjects (who are usually patients). Before the patients can be enrolled in the trial, they must be informed about the perceived benefits and risks. The process of apprising the patients about benefits and risks is accomplished by using an informed consent form that the patient must sign. Each year in the United States, many companies perform clinical trials. The impetus for these trials is the development of new drugs or medical devices that the companies wish to bring to market. A primary objective of these clinical trials is to demonstrate the safety and effectiveness of the products to the FDA.

Clinical trials take many forms. In a randomized, controlled clinical trial, patients are randomized into treatment and control groups. Sometimes, only a single treatment group and a historical control group are used. This procedure may be followed when the use of a concurrent control group would be expensive or would expose patients in the control group to undue risks. In the medical device industry, the control also can be replaced by an objective performance criterion (OPC). Established standards for current forms of available treatments can be used to determine these OPCs. Patients who undergo the current forms of available treatment thus constitute a control group. Generally, a large amount of historical data is needed to establish an OPC.

Concurrent randomized controls are often preferred to historical controls because the investigators want to have a sound basis for attributing observed differences between the treatment and control groups to treatment effects. If the trial is conducted without concurrent randomized controls, statisticians can argue that any differences shown could be due to differences among the study patient populations rather than to differences in the treatment. As an example, in a hypothetical study conducted in Southern California, a suitable historical control group might consist of Hispanic women. However, if the treatment were intended for males as well as females (including both genders from many other races), a historical control group comprised of Hispanic women would be inappropriate. In addition, if we then were to use a diverse population of males and females of all races for the treatment group only, how would we know that any observed effect was due to the treatment and not simply to the fact that males respond differently from females or that racial differences are playing a role in the response? Thus, the use of a concurrent control group would overcome the difficulties produced by a historical control group.

In addition, in order to avoid potential bias, patients are often blinded as to study conditions (i.e., treatment or control group), when such blinding is possible. It is also preferable to blind the investigator to the study conditions to prevent bias that could invalidate the study conclusions. When both the investigator and the patient are blinded, the trial is called double-blinded. Double-blinding often is possible in drug treatment studies but rarely is possible in medical device trials. In device trials, the patient sometimes can be blinded but the attending physician cannot be.

To illustrate the scientific value of randomized, blinded, controlled, clinical trials, we will describe a real trial that was sponsored by a medical device company that produces and markets catheters. The trial was designed to determine the safety and efficacy of direct myocardial revascularization (DMR). DMR is a clinical procedure designed to improve cardiac circulation (also called perfusion). The medical procedure involves the placement of a catheter in the patient’s heart. A small laser on the tip of the catheter is fired to produce channels in the heart muscle that theoretically promote cardiac perfusion. The end result should be improved heart function in those patients who are suffering from severe symptomatic coronary artery disease.

In order to determine if this theory works in practice, clinical trials were required. Some studies were conducted in which patients were given treadmill tests before and after treatment in order to demonstrate increased cardiac output. Other measures of improved heart function also were considered in these studies. Results indicated promise for the treatment.

However, critics charged that because these trials did not have randomized controls, a placebo effect (i.e., patients improve because of a perceived benefit from knowing that they received a treatment) could not be ruled out. In the DMR DIRECT trial, patients were randomized to a treatment group and a sham control group. The sham is a procedure used to keep the patient blinded to the treatment. In all cases the laser catheter was placed in the heart. The laser was fired in the patients randomized to the DMR treatment group but not in the patients randomized to the control group. This was a single-blinded trial; i.e., none of the patients knew whether or not they received the treatment. Obviously, the physician conducting the procedure had to know which patients were in the treatment and control groups. The patients, who were advised of the possibility of the sham treatment in the informed consent form, of course received standard care for their illness.

At the follow-up tests, everyone involved, including the physicians, was blinded to the group associated with the laser treatment. For a certain period after the data were analyzed, the results were known only to the independent group of statisticians who had designed the trial and then analyzed the data.

These results were released and made public in October 2000. Quoting the press release, “Preliminary analysis of the data shows that patients who received this laser-based therapy did not experience a statistically significant increase in exercise times or a decrease in the frequency and severity of angina versus the control group of patients who were treated medically. An improvement across all study groups may suggest a possible placebo effect.”

As a result of this trial, the potential benefit of DMR was found not to be significant and not worth the added risk to the patient. Companies and physicians looking for effective treatments for these patients must now consider alternative therapies. The trial saved the sponsor, its competitors, the patients, and the physicians from further use of an ineffective and highly invasive treatment.

### 6. Epidemiological Studies

As seen in the foregoing section, clinical trials illustrate one field that requires much biostatistical expertise. Epidemiology is another such field. Epidemiology is defined as the study of the distribution and determinants of health and disease in populations.

Although experimental methods including clinical trials are used in epidemiology, a major group of epidemiological studies use observational techniques that were formalized during the mid-19th century. In his classic work, John Snow reported on attempts to investigate the source of a cholera outbreak that plagued London in 1849. Snow hypothesized that the outbreak was associated with polluted water drawn from the Thames River. Both the Lambeth Company and the Southwark and Vauxhall Company provided water inside the city limits of London. At first, both the Lambeth Company and the Southwark and Vauxhall Company took water from a heavily polluted section of the Thames River.

The Broad Street area of London provided an excellent opportunity to test this hypothesis because households in the same neighborhood were served by interdigitating water supplies from the two different companies. That is, households in the same geographic area (even adjacent houses) received water from the two companies. This observation by Snow made it possible to link cholera outbreaks in a particular household with one of the two water sources.

Subsequently, the Lambeth Company relocated its water source to a less contaminated section of the river. During the cholera outbreak of 1854, Snow demonstrated that a much greater proportion of residents who used water from the more polluted source contracted cholera than those who used water from the less polluted source. Snow’s method, still in use today, came to be known as a natural experiment [see Friis and Sellers (1999) for more details].

Snow’s investigation of the cholera outbreak illustrates one of the main approaches of epidemiology—use of observational studies. These observational study designs encompass two major categories: descriptive and analytic. Descriptive studies attempt to classify the extent and distribution of disease in populations. In contrast, analytic studies are concerned with causes of disease. Descriptive studies rely on a variety of techniques: (1) case reports, (2) astute clinical observations, and (3) use of statistical methods of description, e.g., showing how disease frequency varies in the population according to demographic variables such as age, sex, race, and socioeconomic status.

For example, the Morbidity and Mortality Weekly Report, published by the Centers for Disease Control (CDC) in Atlanta, periodically issues data on persons diagnosed with acquired immune deficiency syndrome (AIDS) classified according to demographic subgroups within the United States. With respect to HIV and AIDS, these descriptive studies are vitally important for showing the nation’s progress in controlling the AIDS epidemic, identifying groups at high risk, and suggesting needed health care services and interventions. Descriptive studies also set the stage for analytic studies by suggesting hypotheses to be explored in further research.

Snow’s natural experiment provides an excellent example of both descriptive and analytic methodology. The reader can probably think of many other examples that would interest statisticians. Many natural experiments are the consequences of government policies. To illustrate, California has introduced many innovative laws to control tobacco use. One of these, the Smoke-free Bars Law, has provided an excellent opportunity to investigate the health effects of prohibiting smoking in alcohol-serving establishments. Natural experiments create a scenario for researchers to test causal hypotheses. Examples of analytic research designs include ecological, case-control, and cohort studies.

We previously defined case-control (Section 1.3.2, Retrospective Studies) and cohort studies (Section 1.3.3, Prospective Studies). Case-control studies have been used in such diverse naturally occurring situations as exploring the causes of toxic shock syndrome among tampon users and investigating diethylstilbestrol as a possible cause of birth defects. Cohort studies such as the famous Framingham Study have been used in the investigation of cardiovascular risk factors.

Finally, ecologic studies involve the study of groups, rather than the individual, as the unit of analysis. Examples are comparisons of national variations in coronary heart disease mortality or variations in mortality at the census tract level. In the former example, a country is the “group,” whereas in the latter, a census tract is the group. Ecologic studies have linked high fat diets to high levels of coronary heart disease mortality. Other ecologic studies have suggested that congenital malformations may be associated with concentrations of hazardous wastes.

### 7. Pharmacoeconomic Studies and Quality of Life

Pharmacoeconomics examines the tradeoff of cost versus benefit for new drugs. The high cost of medical care has caused HMOs, other health insurers, and even some regulatory bodies to consider the economic aspects of drug development and marketing. Cost control became an important discipline in the development and marketing of drugs in the 1990s and will continue to grow in importance during the current century. Pharmaceutical companies are becoming increasingly aware of the need to gain expertise in pharmacoeconomics as they start to implement cost control techniques in clinical trials as part of winning regulatory approvals and, more importantly, convincing pharmacies of the value of stocking their products. The ever-increasing cost of medical care has led manufacturers of medical devices and pharmaceuticals to recognize the need to evaluate products in terms of cost versus effectiveness in addition to the usual efficacy and safety criteria that are standard for regulatory approvals. The regulatory authorities in many countries also see the need for these studies.

Predicting the cost versus benefit of a newly developed drug involves an element of uncertainty. Consequently, statistical methods play an important role in such analyses. Currently, there are many articles and books on projecting the costs versus benefits in new drug development. A good starting point is Bootman (1996). One of the interesting and important messages from Bootman’s book is the need to consider a perspective for the analysis. The perceptions of cost/benefit tradeoffs differ depending on whether they are seen from the patient’s perspective, the physician’s perspective, society’s perspective, an HMO’s perspective, or a pharmacy’s perspective. The perspective has an important effect on which drug-related costs should be included, what comparisons should be made between alternative formulations, and which type of analysis is needed. Further discussion of cost/benefit tradeoffs is beyond the scope of this text. Nevertheless, it is important for health scientists to be aware of such tradeoffs.

Quality of life has played an increasing role in the study of medical treatments for patients. Physicians, medical device companies, and pharmaceutical firms have started to recognize that the patient’s own feeling of well-being after a treatment is as important or more important than some clinically measurable efficacy parameters. Also, in comparing alternative treatments, providers need to realize that many products are basically equivalent in terms of the traditional safety and efficacy measures and that what might set one treatment apart from the others could be an increase in the quality of a patient’s life. In the medical research literature, you will see many terms that all basically deal with the patients’ view of the quality of their life. These terms and acronyms are quality of life (QoL), health related quality of life (HRQoL), outcomes research, and patient reported outcomes (PRO).

Quality of life usually is measured through specific survey questionnaires. Researchers have developed and validated many questionnaires for use in clinical trials to establish improvements in aspects of patients’ quality of life. These questionnaires, which are employed to assess quality of life issues, generate qualitative data.

In Chapter 12, we will introduce you to research that involves the use of statistical analysis measures for qualitative data. The survey instruments, their validation and analysis are worthy topics for an entire book. For example, Fayers and Machin (2000) give an excellent introduction to this subject matter.

In conclusion, Chapter 1 has presented introductory material regarding the field of statistics. This chapter has illustrated how statistics are important in everyday life and, in particular, has demonstrated how statistics are used in the health sciences. In addition, the chapter has reviewed major job roles for statisticians. Finally, information was presented on major categories of study designs and sources of health data that statisticians may encounter. Tables 1.1 through 1.3 review and summarize the key points presented in this chapter regarding the uses of statistics, job roles for statisticians, and sources of health data.

### Table 1.1. Uses of Statistics in Health Sciences

1. Interpret research studies
   - Example: Validity of findings of health education and medical research
2. Evaluate statistics used every day
   - Examples: Hospital mortality rates, prevalence of infectious diseases
3. Presentation of data to audiences
   - Effective arrangement and grouping of information and graphical display of data
4. Illustrate central tendency and variability
5. Formulate and test hypotheses
   - Generalize from a sample to the population

### Table 1.2. What Do Statisticians Do?

1. Guide design of an experiment, clinical trial, or survey
2. Formulate statistical hypotheses and determine appropriate methodology
3. Analyze data
4. Present and interpret results

### Table 1.3. Sources of Health Data

1. Archival and vital statistics records
2. Experiments
3. Medical research studies
   - Retrospective: case-control study
   - Prospective: cohort study
4. Descriptive surveys
5. Clinical trials
