# Bayesian Methods

## Chapter: Biostatistics for the Health Sciences: Tests of Hypotheses

The Bayesian paradigm provides an approach to statistical inference that is different from the methods we have considered thus far. Although the topic is not commonly taught in introductory statistics courses, we believe that Bayesian methods deserve coverage in this text. Although the basic idea goes back to Thomas Bayes’ treatise, written more than 200 years ago, the Bayesian idea came into use as a tool of inference mostly in the 20th century. There are now many books on the subject, even though it was long out of favor among mainstream statisticians.

In the 1990s, Bayesian methods had a rebirth in popularity with the advent of fast computational techniques (especially the Markov chain Monte Carlo approaches), which allowed computation of general posterior probability distributions that had been difficult or impossible to compute (or approximate) previously. Posterior distributions will be defined shortly. Bayesian hierarchical methods now are being used in medical device submissions to the FDA.

A good introductory text that provides the Bayesian perspective was authored by Berry (1996). Bayesian hierarchical models also are used as a method for doing meta-analyses (as described from the frequentist approach in the previous section). An excellent treatment of the use of Bayesian meta-analyses in many medical applications is given in Stangl and Berry (2000), which we mentioned in the previous section.

Basically, in the Bayesian approach to inference, the unknown parameters are treated as random quantities with probability distributions to describe their uncertainty. Prior to collecting data, a distribution called the prior distribution is chosen to describe our belief about the possible values of the parameters.

Although Bayesian analysis is simple when there is only one parameter, often we are interested in more than one parameter. In addition, one or more nuisance parameters may be involved, as is the case in frequentist inference about a mean when the variance is unknown. In this instance, the mean is the parameter of interest and the variance is a nuisance parameter. In frequentist analysis, we estimate the variance from the data and use it to form a t statistic whose frequency distribution does not depend on the nuisance parameter. In the Bayesian approach, we determine a bivariate prior distribution for the mean and variance; we use Bayes’ rule and the data to construct a bivariate posterior distribution for the mean and variance; then we integrate over the values for the variance to obtain a marginal posterior distribution for the mean.
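The three Bayesian steps just described (bivariate prior, bivariate posterior, integration over the variance) can be sketched numerically on a grid. This is only an illustration under assumptions that are ours, not the text's: the data are treated as normal, the prior is taken proportional to 1/variance, and the function name, data values, and grids are hypothetical.

```python
import math

def marginal_posterior_mean(data, mu_grid, var_grid):
    """Grid approximation to the marginal posterior of the mean.

    Assumes normal data and a prior p(mu, var) proportional to 1/var
    (an illustrative choice, not one prescribed by the text).
    """
    n = len(data)
    log_joint = []
    for mu in mu_grid:
        ss = sum((x - mu) ** 2 for x in data)
        row = []
        for var in var_grid:
            log_lik = -0.5 * n * math.log(var) - ss / (2.0 * var)
            log_prior = -math.log(var)
            row.append(log_lik + log_prior)
        log_joint.append(row)
    # Subtract the peak before exponentiating to avoid numerical underflow.
    peak = max(max(row) for row in log_joint)
    joint = [[math.exp(v - peak) for v in row] for row in log_joint]
    total = sum(sum(row) for row in joint)
    # Summing each row over the variance grid "integrates out" the nuisance parameter.
    return [sum(row) / total for row in joint]

data = [4.1, 5.3, 4.8, 5.9, 4.9]                      # hypothetical measurements
mu_grid = [2.0 + 0.05 * i for i in range(121)]        # candidate means, 2.0 to 8.0
var_grid = [0.05 + 0.05 * j for j in range(100)]      # candidate variances, 0.05 to 5.0
marginal = marginal_posterior_mean(data, mu_grid, var_grid)
posterior_mean = sum(m * p for m, p in zip(mu_grid, marginal))
```

The returned list gives the posterior probability attached to each grid value of the mean; the variance no longer appears because it has been summed out.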

Bayes’ rule is simply a mathematical formula that says that you find the posterior distribution for a parameter θ by taking the prior distribution for θ, multiplying it by the likelihood for the data given a specified value of θ, and rescaling the product so that it sums (or integrates) to 1. For the mean, this likelihood can be regarded as the sampling distribution for the sample mean X̄ when the population variance is assumed to be known and the population mean is a specified μ. We know by the central limit theorem that this distribution is approximately normal with mean μ and variance σ²/n, where σ² is the known variance and n is the sample size. The density function for this normal distribution is the likelihood. We multiply the likelihood by the prior density for μ to get the posterior density, called the posterior density of μ given the sample mean X̄.
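When the prior for μ is itself taken to be normal (an assumption we add here for illustration; the text does not specify a prior family), multiplying the prior by the normal likelihood gives a normal posterior in closed form. The sketch below shows that update; the function name and the numbers are hypothetical.

```python
def normal_posterior(prior_mean, prior_var, sample_mean, sigma2, n):
    """Posterior mean and variance of mu for a normal prior and a
    normal likelihood for the sample mean (known variance sigma2)."""
    like_var = sigma2 / n                       # variance of the sample mean
    post_precision = 1.0 / prior_var + 1.0 / like_var
    post_var = 1.0 / post_precision
    # Posterior mean is a precision-weighted average of prior mean and sample mean.
    post_mean = post_var * (prior_mean / prior_var + sample_mean / like_var)
    return post_mean, post_var

# Hypothetical numbers: prior N(120, 100), sample mean 130 from n = 16, sigma2 = 400.
post_mean, post_var = normal_posterior(120.0, 100.0, 130.0, 400.0, 16)
print(post_mean, post_var)
```

With these inputs the posterior mean is pulled from the prior mean of 120 most of the way toward the sample mean of 130, because the data carry more precision than the prior.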

There is controversy among the schools of statistical inference (Bayesian and frequentist). With respect to the Bayesian approach, the controversy involves the treatment of μ as a random quantity with a prior distribution. In the discrete case, it is a simple law of conditional probabilities that if X and Y are two random quantities, then P[X = x|Y = y] = P[X = x, Y = y]/P[Y = y] = P[Y = y|X = x]P[X = x]/P[Y = y]. Now, P[Y = y] = Σ_x P[Y = y, X = x]. This leads to Bayes’ rule, the uncontroversial mathematical result that P[X = x|Y = y] = P[Y = y|X = x]P[X = x]/Σ_x P[Y = y, X = x].

In the problem of a population mean, the Bayesian followers take X to be the population mean and Y the sample estimate. The left-hand side of the above equation {P[X = x|Y = y]} is the posterior distribution (or density) for X, and the right-hand side is the appropriately scaled likelihood for Y, given X (P[Y = y|X = x]/Σ_x P[Y = y, X = x]) multiplied by the prior distribution (or density) for X at x (namely, P[X = x]). The formula applies for continuous or discrete random quantities but is derived more easily in the discrete case. The mathematics cannot be disputed, but one can question philosophically the existence of a prior distribution for X when X is an unknown parameter of a probability distribution.
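The discrete form of Bayes’ rule is easy to check numerically. The following minimal sketch multiplies each prior probability P[X = x] by the likelihood P[Y = y|X = x] and divides by the sum over x; the three candidate values of X and all of the probabilities are made up for illustration.

```python
def discrete_posterior(prior, likelihood):
    """Return P[X = x | Y = y] for every x.

    prior:      dict mapping x to P[X = x]
    likelihood: dict mapping x to P[Y = y | X = x] for the observed y
    """
    joint = {x: likelihood[x] * prior[x] for x in prior}   # P[Y = y, X = x]
    evidence = sum(joint.values())                         # P[Y = y]
    return {x: joint[x] / evidence for x in joint}

# Hypothetical example: three candidate values for the population mean.
prior = {110: 0.5, 120: 0.3, 130: 0.2}
likelihood = {110: 0.1, 120: 0.4, 130: 0.8}
posterior = discrete_posterior(prior, likelihood)
print(posterior)
```

Here the value 130 starts with the smallest prior probability but ends with the largest posterior probability, because the observed data are most likely under it.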

Point estimates of parameters usually are obtained by taking the mode of the posterior distribution (but means or medians also can be used). The analog to the confidence interval is called a credible region and is obtained by finding points a and b such that the posterior probability that the parameter μ falls in the interval [a, b] is set at a value such as 0.95. Points a and b are not unique and generally are chosen on grounds of symmetry. Sometimes the points are selected optimally in order to make the width of the interval as short as possible.
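As a small illustration of a credible interval chosen on grounds of symmetry, the sketch below computes an equal-tailed 95% interval from a normal posterior using Python's standard `statistics.NormalDist`. The posterior mean and variance are hypothetical values, and the normal shape is our assumption. For a symmetric, unimodal posterior such as this one, the equal-tailed interval is also the shortest one.

```python
from statistics import NormalDist

def equal_tailed_interval(posterior, level=0.95):
    """Points a and b that cut off equal posterior probability in each tail."""
    tail = (1.0 - level) / 2.0
    return posterior.inv_cdf(tail), posterior.inv_cdf(1.0 - tail)

# Hypothetical posterior for mu: normal with mean 128 and variance 20.
posterior = NormalDist(mu=128.0, sigma=20.0 ** 0.5)
a, b = equal_tailed_interval(posterior)
coverage = posterior.cdf(b) - posterior.cdf(a)   # posterior probability of [a, b]
point_estimate = posterior.mean                  # mode, mean, and median coincide here
print(a, b, coverage)
```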

For hypothesis testing, one constructs an odds ratio for the alternative hypothesis relative to the null hypothesis as a prior distribution and then applies Bayes’ rule to construct a posterior odds ratio given the test data. That is, we have a distribution for the ratio of the probability that the alternative is true to the probability that the null hypothesis is true. Before collecting the data, one specifies how large this ratio should be in order to reject the null hypothesis. See Berry (1996) for more details and examples.
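In the simplest setting, where the null and the alternative each specify a single value of μ, the posterior odds equal the prior odds multiplied by the ratio of the two likelihoods. The sketch below illustrates that special case only; the point hypotheses, the standard error, the prior odds, and the rejection threshold are all hypothetical choices of ours, not values from Berry (1996).

```python
from statistics import NormalDist

def posterior_odds(prior_odds, sample_mean, mu_null, mu_alt, se):
    """Posterior odds of the alternative relative to the null.

    Assumes two point hypotheses and a normal likelihood for the
    sample mean with known standard error se.
    """
    lik_alt = NormalDist(mu_alt, se).pdf(sample_mean)
    lik_null = NormalDist(mu_null, se).pdf(sample_mean)
    return prior_odds * lik_alt / lik_null

threshold = 3.0   # chosen before seeing the data (hypothetical value)
odds = posterior_odds(prior_odds=1.0, sample_mean=128.0,
                      mu_null=120.0, mu_alt=130.0, se=5.0)
reject_null = odds >= threshold
print(odds, reject_null)
```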

Markov chain Monte Carlo methods now have made it computationally feasible to choose realistic prior distributions and solve hierarchical Bayesian problems. This development has led to a great deal of statistical research using the Bayesian approach to solve problems. Most researchers use the software WinBUGS and associated diagnostics to solve Bayesian problems. Developed in the United Kingdom, this software is free of charge. See Chapter 16 for details on WinBUGS.
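To give a feel for what a Markov chain Monte Carlo method does, here is a minimal random-walk Metropolis sampler written in plain Python. It is a teaching sketch, not how WinBUGS works internally, and the target posterior (a normal prior combined with a normal likelihood for the sample mean), the step size, and the burn-in length are our own illustrative assumptions.

```python
import math
import random

def metropolis(log_post, start, step, n_samples, seed=0):
    """Random-walk Metropolis draws from a density known up to a constant."""
    rng = random.Random(seed)
    current = start
    current_lp = log_post(current)
    draws = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior(proposal) / posterior(current)).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        draws.append(current)
    return draws

def log_posterior(mu):
    """Unnormalized log posterior: prior N(120, 100) times a normal
    likelihood for a sample mean of 130 with variance 25 (hypothetical)."""
    return -((mu - 120.0) ** 2) / 200.0 - ((130.0 - mu) ** 2) / 50.0

draws = metropolis(log_posterior, start=120.0, step=5.0, n_samples=20000)
kept = draws[2000:]                      # discard early draws as burn-in
estimate = sum(kept) / len(kept)         # Monte Carlo estimate of the posterior mean
print(estimate)
```

For this particular target the posterior is available in closed form (normal with mean 128 and variance 20), so the simulated estimate can be compared with the exact answer; the value of the method is that the same loop works when no closed form exists.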