# Correlation, Linear Regression, and Logistic Regression


## Chapter: Biostatistics for the Health Sciences: Correlation, Linear Regression, and Logistic Regression


The previous chapter presented various chi-square tests for determining whether two variables that represent categorical measurements are significantly associated. This raises the question of how to assess associations between variables that represent higher levels of measurement. This chapter covers the Pearson product-moment correlation coefficient (also called the Pearson correlation coefficient or Pearson correlation), a method for assessing the association between two variables measured at the interval or ratio level.

Recall from the previous chapter that examples of interval-level measurement are Fahrenheit temperature and I.Q. scores; ratio-level measures include blood pressure, serum cholesterol, and many other biomedical research variables that have a true zero point. Compared with the chi-square test, the correlation coefficient provides additional useful information, namely the strength of the association between the two variables.
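As a concrete illustration of how the Pearson correlation summarizes strength of association, the following minimal Python sketch computes r from its usual definition (the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations). The function name `pearson_r` and the small data set are assumptions made up for this example; they are not taken from the chapter or from real patients.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up illustrative values (assumption), not real patient data.
age = [35, 42, 50, 58, 63, 70]
systolic_bp = [118, 121, 130, 135, 142, 150]

r = pearson_r(age, systolic_bp)
print(f"r = {r:.3f}")
```

Values of r near +1 or -1 indicate a strong linear association, and values near 0 indicate a weak one, which is the extra information a chi-square test of association does not give.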

We will also see that linear regression and correlation are related: there are formulas that connect the correlation coefficient to the slope parameter of the regression equation. In contrast to correlation, linear regression is used to predict the value of a second variable (the dependent variable) when the value of a predictor variable (the independent variable) is known.
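One such relationship, for simple linear regression with a single predictor, is that the least-squares slope equals r multiplied by the ratio of the sample standard deviations, slope = r × (s_y / s_x). The sketch below checks this numerically. The function names and the data are hypothetical and chosen only for illustration.

```python
import math
import statistics


def least_squares_slope(x, y):
    """Slope of the least-squares line of y on x, computed directly."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def slope_from_correlation(x, y):
    """Same slope, obtained as r * (s_y / s_x)."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    return r * statistics.stdev(y) / statistics.stdev(x)


# Made-up illustrative values (assumption), not real patient data.
weight_kg = [60, 68, 75, 82, 90, 97]
cholesterol = [172, 185, 190, 204, 211, 226]

b_direct = least_squares_slope(weight_kg, cholesterol)
b_from_r = slope_from_correlation(weight_kg, cholesterol)
intercept = statistics.mean(cholesterol) - b_direct * statistics.mean(weight_kg)

print(f"slope (direct)    = {b_direct:.4f}")
print(f"slope (from r)    = {b_from_r:.4f}")
print(f"predicted y at 80 = {intercept + b_direct * 80:.1f}")
```

The two slopes agree up to floating-point rounding, and the fitted line can then be used to predict the dependent variable for a given predictor value.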

Another technique that provides information about the strength of association between a predictor variable (e.g., a risk factor) and an outcome variable (e.g., dead or alive) is logistic regression. In a logistic regression analysis, the outcome is a dichotomy; the predictors can be selected from variables that represent several levels of measurement (such as categorical or ordinal), as we will demonstrate in Section 12.9. For example, a physician may use a patient’s total serum cholesterol value and race to predict high or low levels of coronary heart disease risk.
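To make the idea of a dichotomous outcome concrete, the sketch below shows how a fitted logistic regression model would turn predictor values into a probability and then into a high/low classification. The intercept, the coefficients, the 0/1 coding of the categorical predictor, and the 0.5 cutoff are all hypothetical values chosen for illustration; they do not come from the chapter or from any fitted model.

```python
import math


def logistic_probability(intercept, coefficients, predictors):
    """Probability of the outcome under a logistic model.

    The linear predictor is intercept + sum(b_i * x_i); the logistic
    function maps it to a value strictly between 0 and 1.
    """
    linear_predictor = intercept + sum(
        b * v for b, v in zip(coefficients, predictors)
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# HYPOTHETICAL coefficients (assumption): one per mg/dL of total serum
# cholesterol, one for a 0/1 indicator coding a categorical predictor.
INTERCEPT = -6.0
COEFFICIENTS = [0.025, 0.4]
CUTOFF = 0.5  # assumed threshold for labeling risk as "high"

p_low = logistic_probability(INTERCEPT, COEFFICIENTS, [180, 0])
p_high = logistic_probability(INTERCEPT, COEFFICIENTS, [280, 0])

for label, p in (("cholesterol 180", p_low), ("cholesterol 280", p_high)):
    risk = "high" if p >= CUTOFF else "low"
    print(f"{label}: p = {p:.3f} -> {risk} risk")
```

In such a model, exponentiating a coefficient gives the odds ratio associated with a one-unit increase in that predictor, which is how logistic regression expresses strength of association.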