# Why Study Statistics?


## Chapter: Biostatistics for the Health Sciences: What is Statistics? How Is It Applied to the Health Sciences?


Technological advances continually make new disease prevention and treatment possibilities available for health care. Consequently, a substantial body of medical research explores alternative methods for treating diseases or injuries. Because outcomes vary from one patient to another, researchers use statistical methods to quantify uncertainty in the outcomes, summarize and make sense of data, and compare the effectiveness of different treatments. Federal government agencies and private companies rely heavily on statisticians’ input.

The U.S. Food and Drug Administration (FDA) requires manufacturers of new drugs and medical devices to demonstrate the effectiveness and safety of their products when compared to current alternative treatments and devices. Because this process requires a great deal of statistical work, these industries employ many statisticians to design studies and analyze the results. Controlled clinical trials, described later in this chapter, provide a commonly used method for assessing product efficacy and safety. These trials are conducted to meet regulatory requirements for the market release of the products. The FDA considers such trials to be the gold standard among the study approaches that we will cover in this text.

Medical device and pharmaceutical company employees—clinical investigators and managers, quality engineers, research and development engineers, clinical research associates, database managers, as well as professional statisticians—need to have basic statistical knowledge and an understanding of statistical terms. When you consider the following situations that actually occurred at a medical device company, you will understand why a basic knowledge of statistical methods and terminology is important.

Situation 1: You are the clinical coordinator for a clinical trial of an ablation catheter (a catheter that is placed in the heart to burn tissue in order to eliminate an electrical circuit that causes an arrhythmia). You are enrolling patients at five sites and want to add a new site. In order to add a new site, a local review board called an institutional review board (IRB) must review and approve your trial protocol.

A member of the board asks you what your stopping rule is. You do not know what a stopping rule is and cannot answer the question. Even worse, you do not even know who can help you. If you had taken a statistics course, you might know that many trials are constructed using group sequential statistical methods. These methods allow for the data to be compared at various times during the trial. Thresholds that vary from stage to stage determine whether the trial can be stopped early to declare the device safe and/or effective. They also enable the company to recognize the futility of continuing the trial (for example, because of safety concerns or because it is clear that the device will not meet the requirements for efficacy). The sequence of such thresholds is called the stopping rule.

The IRB has taken for granted that you know this terminology. However, group sequential methods are more common in pharmaceutical trials than in medical device trials. The correct answer to the IRB is that you are running a fixed-sample-size trial and, therefore, no stopping rule is in effect. After studying the material in this book, you will be aware of what group sequential methods are and know what stopping rules are.
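The idea of a stopping rule as a sequence of stage-specific thresholds can be sketched in a few lines. This is only an illustration under stated assumptions, not the design of any actual trial: the function name `first_stopping_stage`, the interim z-statistics, and the boundary values are all hypothetical and were chosen simply to show thresholds that vary from stage to stage.

```python
from typing import Optional, Sequence, Tuple


def first_stopping_stage(
    z_stats: Sequence[float],
    efficacy_bounds: Sequence[float],
    futility_bounds: Sequence[float],
) -> Optional[Tuple[int, str]]:
    """Return (stage, reason) for the first interim look that crosses a boundary.

    Returns None if no boundary is crossed at any of the supplied looks.
    """
    looks = zip(z_stats, efficacy_bounds, futility_bounds)
    for stage, (z, upper, lower) in enumerate(looks, start=1):
        if z >= upper:
            return stage, "efficacy"  # evidence strong enough to stop early
        if z <= lower:
            return stage, "futility"  # continuing is unlikely to succeed
    return None


# Hypothetical boundaries for four looks (illustrative numbers only).
# The two boundaries meet at the last look, so the trial always concludes.
efficacy = [4.0, 3.0, 2.5, 2.0]
futility = [-1.0, 0.0, 0.5, 2.0]

# Hypothetical test statistics observed at each look.
interim_z = [1.2, 2.1, 2.7, 2.9]

print(first_stopping_stage(interim_z, efficacy, futility))  # (3, 'efficacy')
```

In this framing, a fixed-sample-size trial corresponds to a single look at the end of the study, which is why no stopping rule applies to it.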

Situation 2: As a regulatory affairs associate at a medical device company that has completed a clinical trial of an ablation catheter, you have submitted a regulatory report called a premarket approval application (PMA). In the PMA, your statistician has provided statistical analyses for the study endpoints (performance measures used to demonstrate safety or effectiveness).

The reviewers at the Food and Drug Administration (FDA) send you a letter with questions and concerns about deficiencies that must be addressed before they will approve the device for marketing. One of the questions is: “Why did you use the Greenwood approximation instead of Peto’s method?” The FDA prefers Peto’s method and would like you to compute the results by using that method.

You recognize that the foregoing example involves a statistical question but have no idea what the Greenwood and Peto methods are. You consult your statistician, who tells you that she conducted a survival analysis (a study of treatment failure as a function of time across the patients enrolled in the study). In the survival analysis, time to recurrence of the arrhythmia is recorded for each patient. As most patients never have a recurrence, they are treated as having a right-censored recurrence time (their time to event is cut off at the end of the trial or the time of the analysis).

Based on the data, a Kaplan–Meier curve, the common nonparametric estimate for the survival curve, is generated. The survival curve provides the probability that a patient will not have a recurrence by time t. It is plotted as a function of t and decreases from 1 at time 0. The Kaplan–Meier curve is an estimate of this survival curve based on the trial data (survival analysis is covered in Chapter 15).

You will learn that the uncertainty in the Kaplan–Meier curve, a statistical estimate, can be quantified in a confidence interval (covered in general terms in Chapter 8). The Greenwood and Peto methods are two approximate methods for placing confidence intervals on the survival curve at specified times t. Statistical research has shown that the Greenwood method often provides a lower confidence bound estimate that is too high. In contrast, the Peto method gives a lower and possibly better estimate for the lower bound, particularly when t is large. The FDA prefers the bound obtained by the Peto method because for large t, most of the cases have been right-censored. However, both methods are approximations and neither one is “correct.”
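To make the comparison concrete, here is a minimal sketch of the Kaplan–Meier estimate at a time t together with the two standard errors. It assumes the usual textbook forms: Greenwood's formula, S(t)·sqrt(Σ d_i / (n_i (n_i − d_i))), and Peto's formula, S(t)·sqrt((1 − S(t)) / n(t)), where n(t) is taken here as the number of patients still under observation at t (conventions for n(t) vary slightly between references). The function name `km_with_se` and the ten-patient data set are hypothetical.

```python
import math


def km_with_se(times, events, t):
    """Kaplan-Meier survival at time t with Greenwood and Peto standard errors.

    times  : follow-up time for each patient
    events : 1 if a recurrence was observed at that time, 0 if right-censored
    """
    survival = 1.0
    greenwood_sum = 0.0
    event_times = sorted({ti for ti, e in zip(times, events) if e == 1 and ti <= t})
    for event_time in event_times:
        at_risk = sum(1 for ti in times if ti >= event_time)
        failures = sum(
            1 for ti, e in zip(times, events) if ti == event_time and e == 1
        )
        survival *= 1 - failures / at_risk
        if at_risk > failures:  # the Greenwood term is undefined otherwise
            greenwood_sum += failures / (at_risk * (at_risk - failures))

    still_at_risk = sum(1 for ti in times if ti >= t)
    greenwood_se = survival * math.sqrt(greenwood_sum)
    if still_at_risk > 0:
        peto_se = survival * math.sqrt((1 - survival) / still_at_risk)
    else:
        peto_se = float("nan")
    return survival, greenwood_se, peto_se


# Hypothetical data: three recurrences, seven right-censored patients.
times = [2, 3, 4, 5, 6, 7, 8, 9, 10, 10]
events = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]

s, g_se, p_se = km_with_se(times, events, t=9)
print(f"S(9) = {s:.3f}")
print(f"Greenwood lower bound: {s - 1.96 * g_se:.3f}")
print(f"Peto lower bound:      {s - 1.96 * p_se:.3f}")
```

With this made-up data, few patients remain under observation at t = 9, so the Peto standard error is larger and its lower bound falls below the Greenwood bound, which is the behavior described above for large t. The simple "estimate minus 1.96 standard errors" bound is itself only one of several ways to form the interval.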

From the present text, you will learn about confidence bounds and survival distributions; eventually, you will be able to compute both the Greenwood and Peto bounds. (You already know enough to respond to the FDA question, “Why did you use the Greenwood approximation . . . ?” by asking a statistician to provide the Peto lower bound in addition to the Greenwood.)

Situation 3: Again, you are a regulatory affairs associate and are reviewing an FDA letter about a PMA submission. The FDA wants to know if you can present your results on the primary endpoints in terms of confidence intervals instead of just reporting p-values (the p-value provides a summary of the strength of evidence against the null hypothesis and will be covered in Chapter 9). Again, you recognize that the FDA’s question involves statistical issues.

When you ask for help, the statistician tells you that the p-value is a summary of the results of a hypothesis test. Because the statistician is familiar with the test and the value of the test statistic, he can use the critical value(s) for the test to generate a confidence bound or confidence bounds for the hypothesized parameter value. Consequently, you can tell the FDA that you are able to provide them with the information they want.

The present text will teach you about the one-to-one correspondence between hypothesis tests and confidence intervals (Chapter 9) so that you can construct a hypothesis test based on a given confidence interval or construct the confidence bounds based on the results of the hypothesis test.
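The correspondence can be illustrated with a small sketch. It assumes the simplest setting, a two-sided z-test on an estimate that is approximately normally distributed with a known standard error. The function name `z_test_and_ci` and the numbers are hypothetical; the point is only that the test rejects at level 0.05 exactly when the 95% confidence interval excludes the null value.

```python
from statistics import NormalDist


def z_test_and_ci(estimate, std_error, null_value=0.0, alpha=0.05):
    """Two-sided z-test p-value and the matching (1 - alpha) confidence interval."""
    standard_normal = NormalDist()
    z = (estimate - null_value) / std_error
    p_value = 2 * (1 - standard_normal.cdf(abs(z)))
    critical = standard_normal.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    interval = (estimate - critical * std_error, estimate + critical * std_error)
    return p_value, interval


# Hypothetical treatment effect estimate and standard error.
p_value, (lower, upper) = z_test_and_ci(estimate=1.2, std_error=0.5)

rejects_null = p_value < 0.05
interval_excludes_null = not (lower <= 0.0 <= upper)

print(f"p-value = {p_value:.4f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print(rejects_null, interval_excludes_null)  # the two answers always agree
```

Because the same critical value defines both the rejection region and the interval width, a statistician who knows the test statistic and its standard error can report either form.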

Situation 4: You are a clinical research associate (CRA) in the middle of a clinical trial. Based on data provided by your statistics group, you are able to change your chronic endpoint from a six-month follow-up result to a three-month follow-up result. This change is exciting because it may mean that you can finish the trial much sooner than you anticipated. However, there is a problem: the original protocol required follow-ups only at two weeks and at six months after the procedure, whereas a three-month follow-up was optional.

Some of the sites opt not to have a three-month follow-up. Your clinical manager wants you to ask the investigators to have the patients who are past three months postprocedure but not near the six-month follow-up come in for an unscheduled follow-up. When the investigator and a nurse associate hear about this request, they are reluctant to go to the trouble of bringing in the patients. How do you convince them to comply?

You ask your statistician to explain the need for an unscheduled follow-up. She says that the trial started with a six-month endpoint because the FDA viewed six months to be a sufficient duration for the trial. However, an investigation of Kaplan–Meier curves for similar studies showed that there was very little decrease in the survival probability in the period from three to six months. This finding convinced the FDA that the three-month endpoint would provide sufficient information to determine the long-term survival probability.

The statistician tells the investigator that we could not have put this requirement into the original protocol because the information to convince the FDA did not exist then. However, now that the FDA has changed its position, we must have the three-month information on as many patients as possible. By going to the trouble of bringing in these patients, we will obtain the information that we need for an early approval. The early approval will allow the company to market the product much faster and allow the site to use the device sooner. As you learn about survival curves in this text, you will appreciate how greatly survival analyses impact the success of a clinical trial.

Situation 5: You are the Vice President of the Clinical and Regulatory Affairs Department at a medical device company. Your company hired a contract research organization (CRO) to run a randomized controlled clinical trial (described in Section 1.3.5, Clinical Trials). A CRO was selected in order to maintain complete objectivity and to guarantee that the trial would remain blinded throughout. Blinding is a procedure of coding the allocation of patients so that neither they nor the investigators know to which treatment the patients were assigned in the trial.

You will learn that blinding is important to prevent bias in the study. The trial has been running for two years. You have no idea how your product is doing. The CRO is nearing completion of the analysis and is getting ready to present the report and unblind the study (i.e., let others know the treatment group assignments for the patients). You are very anxious to know if the trial will be successful. A successful trial will provide a big financial boost for your company, which will be able to market this device that provides a new method of treatment for a particular type of heart disease.

The CRO shows you their report because you are the only one allowed to see it until the announcement, two weeks hence. Your company’s two expert statisticians are not even allowed to see the report. You have limited statistical knowledge, but you are accustomed to seeing results reported in terms of p-values for tests. You see a demographic analysis comparing patients by age and gender in the treatment and the control groups. As the p-value is 0.56, you are alarmed, for you are used to seeing small p-values. You know that, generally, the FDA requires p-values below 0.05 for acceptance of a device for marketing. There is nothing you can do but worry for the next two weeks.

If you had a little more statistical training or had a chance to speak to your statistician, you might have heard the following: Generally, hypothesis tests are set up so that the null hypothesis states that there is no difference among groups; you want to reject the null hypothesis to show that results are better for the treatment group than for the control group. A low p-value (below 0.05, the usual threshold) indicates that the results favor the treatment group in comparison to the control group. Conversely, a high p-value (above 0.05) indicates no significant improvement.

However, for the demographic analysis, we want to show no difference in outcome between groups by demographic characteristics. We want the difference in the value for primary endpoints (in this case, length of time the patient is able to exercise on a treadmill three months after the procedure) to be attributed to a difference in treatment. If there are demographic differences between groups, we cannot determine whether a statistically significant difference in performance between the two groups is attributable to the device being tested or simply to the demographic differences. So when comparing demographics, we are not interested in rejecting the null hypothesis; therefore, high p-values provide good news for us.
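A demographic comparison of this kind might look like the following sketch, which compares the proportion of female patients in the two groups with a pooled two-proportion z-test. The choice of test, the function name `two_proportion_p_value`, and the counts are assumptions made for illustration; the report in the situation above does not say which test produced its p-value.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value for equality of two proportions (pooled z-test)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical counts: 52 of 100 treatment patients and 48 of 100 control
# patients are female.
p_gender = two_proportion_p_value(52, 100, 48, 100)
print(f"p-value for gender balance = {p_gender:.2f}")
```

A large p-value here means the data show no evidence of a gender imbalance between the groups, which is the reassuring result for a baseline comparison.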

From the preceding situations, you can see that many employees at medical device companies who are not statisticians have to deal with statistical issues and terminology frequently in their everyday work. As students in the health sciences, you may aspire to career positions that involve responsibilities and issues that are similar to those in the foregoing examples. Also, the medical literature is replete with research articles that include statistical analyses or at least provide p-values for certain hypothesis tests. If you need to study the medical literature, you will need to evaluate some of these statistical results. This text will help you become statistically literate. You will have a basic understanding of statistical techniques and the assumptions necessary for their application.

We noted previously that in recent years, medically related research papers have included more and increasingly sophisticated statistical analyses. However, some medical journals have tended to have a poor track record, publishing papers that contain various errors in their statistical applications. See Altman (1991), Chapter 16, for examples.

Another group that requires statistical expertise in many situations is comprised of public health workers. For example, they may be asked to investigate a disease outbreak (such as a food-borne disease outbreak). There are five steps (using statistics) required to investigate the outbreak: First, collect information about the persons involved in the outbreak, deciding which types of data are most appropriate. Second, identify possible sources of the outbreak, for example, contaminated or improperly stored food or unsafe food handling practices. Third, formulate hypotheses about modes of disease transmission. Fourth, from the collected data, develop a descriptive display of quantitative information (see Chapter 3), e.g., bar charts of cases of occurrence by day of outbreak. Fifth, assess the risks associated with certain types of exposure (see Chapter 11).
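The fifth step, assessing the risk associated with an exposure, is often summarized by comparing attack rates. The sketch below computes a relative risk from a two-by-two table. The function name `relative_risk`, the food item, and the counts are hypothetical, chosen only to show the arithmetic.

```python
def relative_risk(ill_exposed, total_exposed, ill_unexposed, total_unexposed):
    """Return (attack rate if exposed, attack rate if unexposed, relative risk)."""
    attack_exposed = ill_exposed / total_exposed
    attack_unexposed = ill_unexposed / total_unexposed
    return attack_exposed, attack_unexposed, attack_exposed / attack_unexposed


# Hypothetical outbreak: 30 of 50 people who ate a suspect dish became ill,
# compared with 5 of 50 people who did not eat it.
exposed_rate, unexposed_rate, rr = relative_risk(30, 50, 5, 50)
print(f"attack rates: {exposed_rate:.0%} vs {unexposed_rate:.0%}, RR = {rr:.1f}")
```

A relative risk well above 1 points to the exposure as a plausible source, although a full assessment would also attach a measure of uncertainty to the estimate.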

Health education is another public health discipline that relies on statistics. A central concern of health education is program evaluation, which is necessary to demonstrate program efficacy. In conjunction with program evaluation, health educators choose among alternative statistical tests, including (but not limited to) tests for independent or paired groups (t-tests or nonparametric analogues), chi-square tests, and one-way analyses of variance. In designing a needs assessment protocol, health educators conduct a power analysis for sample surveys. Not to be minimized is the need to be familiar with the plethora of statistical techniques employed in contemporary health education and public health literature.

The field of statistics not only has gained importance in medicine and closely related disciplines, as we have described in the preceding examples, but it has become the method of choice in almost all scientific investigations. Salsburg’s recent book “The Lady Tasting Tea” (Salsburg, 2001) explains eloquently why this is so and provides a glimpse at the development of statistical methodology in the 20th century, along with the many famous probabilists and statisticians who developed the discipline during that period. Salsburg’s book also provides insight as to why (possibly in some changing form) the discipline will continue to be important in the 21st century. Random variation just will not go away, even though deterministic theories (i.e., those not based on chance factors) continue to develop.

The examples described in this section are intended to give you an overview of the importance of statistics in realistic situations across all areas of medically related disciplines. They also highlight why all employees in the medical field can benefit from a basic understanding of statistics, although certain positions require a deeper knowledge. We have pointed out in each situation the specific chapters in which you will learn more details about the relevant statistical topics. At this point, you are not expected to understand all the details regarding the examples, but by the completion of the text, you will be able to review and reread them in order to develop a deeper appreciation of the issues involved.
