Advantages and Disadvantages of Nonparametric Versus Parametric Methods

Chapter: Biostatistics for the Health Sciences: Nonparametric Methods

With the exception of the bootstrap, the techniques covered in the first 13 chapters are all parametric techniques. By parametric we mean that they are based on probability models for the data that involve only a few unknown values, called parameters, which refer to measurable characteristics of populations. Usually, the parametric model that we have used has been the normal distribution; the unknown parameters that we attempt to estimate are the population mean μ and the population variance σ².

Many tests (e.g., the F test for equality of variances) and estimation methods (e.g., the least squares solution to linear regression problems) are, however, sensitive to parametric modeling assumptions. These procedures can be shown in theory to be optimal when the parametric model is correct, but they can be inaccurate or misleading when the model does not hold, even approximately.
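As a small illustration of this sensitivity, the sketch below fits a least squares line to six points twice: once with data lying exactly on a line, and once after a single response value is replaced by an outlier of the kind a heavy-tailed error distribution can produce. The data values and the helper name `least_squares_line` are hypothetical and chosen only for illustration.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the ordinary least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares about the means
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar


# Hypothetical data: y = 2x + 1 exactly, then the same data with one outlier
x = [0, 1, 2, 3, 4, 5]
y_clean = [1, 3, 5, 7, 9, 11]
y_outlier = [1, 3, 5, 7, 9, 30]

slope_clean, intercept_clean = least_squares_line(x, y_clean)
slope_outlier, intercept_outlier = least_squares_line(x, y_outlier)

print("clean data:   slope =", slope_clean, "intercept =", intercept_clean)
print("with outlier: slope =", slope_outlier, "intercept =", intercept_outlier)
```

With these values a single aberrant observation moves the fitted slope from 2 to roughly 4.7, which is the kind of distortion the paragraph above warns about.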

Procedures that are not sensitive to the parametric distributional assumptions are called robust. Student’s t test for the difference between two means, when the populations are assumed to have the same variance, is robust to departures from normality because the sample means in the numerator of the test statistic are approximately normal by the central limit theorem.
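For reference, the equal-variance two-sample t statistic can be written in the standard form below. The notation is an assumption made here for illustration: the sample means are written as \bar{x}_1 and \bar{x}_2, the sample variances as s_1^2 and s_2^2, the sample sizes as n_1 and n_2, and the pooled variance as s_p^2.

```latex
\[
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}
\]
```

The numerator contains only the difference of the two sample means, which is why the central limit theorem argument applies to it.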

With nonparametric techniques, the sampling distribution of the test statistic under the null hypothesis does not depend on any unknown parameters. Consequently, these tests do not require the assumption of a parametric family. As an example, the sign test for the difference between two population medians based on paired data has a test statistic, T, which equals the number of positive differences between pairs. Under the null hypothesis that the medians are equal, T has a binomial distribution with parameters n = sample size and p = 1/2. Note that this sampling distribution is completely known under the null hypothesis, since the sample size is given and p = 1/2; there are no unknown parameters that need to be estimated from the data. The sign test is explained in Section 14.5.
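The binomial null distribution of T can be used directly to compute an exact p-value. The following is a minimal sketch, not the text's own procedure: the function name `sign_test_p_value`, the choice to drop zero differences, the doubling rule for the two-sided p-value, and the data values are all assumptions made for illustration.

```python
from math import comb


def sign_test_p_value(differences):
    """Return (T, n, two-sided exact p-value) for a sign test.

    Zero differences are dropped, a common convention assumed here.
    """
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    t = sum(1 for d in nonzero if d > 0)  # T = number of positive differences
    # Under H0, T ~ Binomial(n, 1/2), so P(T = k) = C(n, k) / 2^n
    pmf = [comb(n, k) / 2 ** n for k in range(n + 1)]
    lower = sum(pmf[: t + 1])  # P(T <= t)
    upper = sum(pmf[t:])       # P(T >= t)
    # Two-sided p-value: double the smaller tail, capped at 1
    return t, n, min(1.0, 2 * min(lower, upper))


# Hypothetical paired differences (illustrative values only)
diffs = [1.2, 0.4, -0.3, 2.1, 0.8, 1.5, -0.6, 0.9, 1.1, 0.7]
t, n, p = sign_test_p_value(diffs)
print("T =", t, "n =", n, "p-value =", p)
```

Here 8 of the 10 differences are positive, so the two-sided p-value is 2 × P(T ≥ 8) = 2 × 56/1024 ≈ 0.109. Nothing was estimated from the data to obtain the reference distribution.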

The lack of dependence on parametric assumptions is the advantage of nonparametric tests over parametric ones. Nonparametric tests preserve the significance level of the test regardless of the distribution of the data in the parent population.
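A small simulation can make this property concrete for the sign test. The sketch below assumes n = 10 differences drawn from a continuous distribution with median 0 and a rejection region of T ≤ 1 or T ≥ 9, whose exact size under the binomial null distribution is 22/1024 ≈ 0.0215. The three distributions, the seed, the number of replications, and all function names are hypothetical choices for illustration.

```python
import math
import random


def rejects(diffs):
    """Sign test with rejection region T <= 1 or T >= n - 1."""
    t = sum(1 for d in diffs if d > 0)
    n = len(diffs)
    return t <= 1 or t >= n - 1


def rejection_rate(draw, n=10, reps=20000, seed=0):
    """Estimate how often the test rejects when the true median is 0."""
    rng = random.Random(seed)
    count = 0
    for _ in range(reps):
        if rejects([draw(rng) for _ in range(n)]):
            count += 1
    return count / reps


# Three continuous distributions, each with median 0
samplers = {
    "normal": lambda rng: rng.gauss(0.0, 1.0),
    "shifted exponential": lambda rng: rng.expovariate(1.0) - math.log(2.0),
    "Cauchy": lambda rng: math.tan(math.pi * (rng.random() - 0.5)),
}

# Exact size from Binomial(10, 1/2): 2 * [P(T = 0) + P(T = 1)]
exact_size = 2 * (math.comb(10, 0) + math.comb(10, 1)) / 2 ** 10
rates = {name: rejection_rate(draw) for name, draw in samplers.items()}

print("exact size:", exact_size)
for name, rate in rates.items():
    print(name, "estimated size:", rate)
```

Under these assumptions the estimated rejection rates should all sit close to the exact size, whether the parent distribution is symmetric, skewed, or heavy-tailed.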

When a parametric family is appropriate, the price one pays for a distribution-free test is a loss of power in comparison to the parametric test. Also, in generating the test statistic for a nonparametric procedure, we may throw out useful information. For example, the most popular tests covered in this chapter are rank tests, which keep only the ranks of the observations and not their numerical values.
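To see what information a rank test discards, consider the sketch below. It ranks two hypothetical samples that differ only in the size of their largest value. The helper `ranks` and the convention of giving tied values their average rank (the midrank) are assumptions made for illustration.

```python
def ranks(values):
    """Rank from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of positions i+1 through j+1
        for k in range(i, j + 1):
            result[order[k]] = midrank
        i = j + 1
    return result


# Hypothetical samples: identical except for the magnitude of the largest value
a = [3.1, 4.7, 2.2, 5.0, 4.9]
b = [3.1, 4.7, 2.2, 50.0, 4.9]

print(ranks(a))
print(ranks(b))
```

Both samples receive the ranks 2, 3, 1, 5, 4, so any rank-based statistic treats them identically even though the largest observation differs by a factor of ten.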

In the next section, we will show you how to rank the data in rank tests. Examples of these tests are the Wilcoxon rank-sum test, the Wilcoxon signed-rank test, and the Kruskal–Wallis test. Conover (1999) has written an excellent text on the applications of nonparametric methods.
