# Types of Data

## Chapter: Biostatistics for the Health Sciences: Systematic Organization and Display of Data

The methods for displaying and analyzing data depend upon the type of data being used. In this section, we will define and provide examples of the two major types of data: qualitative and quantitative. Quantitative data can be continuous or discrete. Chapter 11 will give more information about the related topic of measurement systems. We collect data to characterize populations and to estimate parameters, which are numerical or categorical characteristics of a population probability distribution.

To describe types of data, we need to be familiar with the concept of variables. The term “variable” is used to describe a quantity that can vary (i.e., take on various values), such as age, height, weight, or sex. Variables can be characteristics of a population, such as the age of a randomly selected individual in the U.S. population. They can also be estimates (statistics) of population parameters, such as the mean age of a random sample of 100 individuals in the U.S. population. These variables will have probability distributions associated with them, and these distributions will be discussed in Chapter 5.

## 1. Qualitative Data

Variables that can be identified for individuals according to a quality are called qualitative variables. These variables place individuals into categories that do not have numerical values. When the observations are not ordered, they form a nominal scale. (A dichotomous scale—true/false, male/female, yes/no, dead/alive—also is a nominal scale.) Many qualitative variables cannot be ordered (as in going from worst to best). Occupation, marital status, and sex are examples of qualitative data that have no natural ordering. The term nominal refers to qualitative data that do not have a natural ordering.

Some qualitative data can be ordered in the manner of a preference scale (e.g., strongly agree, agree, disagree, strongly disagree). Levels of educational attainment can be ordered from low to very high: less than a high school education might be categorized as low; education beyond high school but without a four-year bachelor’s degree could be considered moderate; a four-year bachelor’s degree might be considered high; and a degree at the master’s, Ph.D., or M.D. level might be considered very high. Although still considered qualitative, categorical data that can be ordered are called ordinal.
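The distinction between nominal and ordinal data matters in practice because ordinal labels must be sorted by rank, not alphabetically. The following Python fragment is a minimal sketch of that idea; the label names and the `sort_ordinal` helper are illustrative assumptions, not part of the text.

```python
# Hypothetical labels for educational attainment, listed from lowest to highest.
EDUCATION_LEVELS = ["low", "moderate", "high", "very high"]

# Map each label to its position in the ordering.
RANK = {label: position for position, label in enumerate(EDUCATION_LEVELS)}


def sort_ordinal(values):
    """Sort ordinal labels by rank rather than alphabetically."""
    return sorted(values, key=RANK.__getitem__)


# Made-up sample of five respondents.
sample = ["high", "low", "very high", "moderate", "low"]
ordered = sort_ordinal(sample)

# Because the categories are ordered, a middle (median) category is meaningful.
median_category = ordered[len(ordered) // 2]
print(ordered, median_category)
```

A nominal variable such as occupation has no comparable rank table, so any ordering of its labels would be arbitrary.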

Qualitative data can be summarized and displayed in pie charts and bar graphs, which describe the frequency of occurrence in the sample or the population of particular values of the characteristics. These graphical representations will be described in Section 3.3. For ordinal data with the categories ordered from lowest to highest, bar graphs might be more appropriate than pie charts, because the bars can be arranged in category order. A pie chart is circular and has no natural starting point, so it is better suited to nominal data.
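Both pie charts and bar graphs are drawn from the same underlying frequency table. As a rough sketch, such a table can be computed in Python with the standard library; the `frequency_table` helper and the marital status sample below are made up for illustration.

```python
from collections import Counter


def frequency_table(observations):
    """Return {category: (count, relative frequency)} for qualitative data."""
    counts = Counter(observations)
    total = sum(counts.values())
    return {category: (n, n / total) for category, n in counts.items()}


# Made-up sample of a nominal variable.
marital_status = [
    "married", "single", "married", "divorced",
    "single", "married", "widowed", "married",
]

table = frequency_table(marital_status)

# Print categories from most to least frequent, as a bar graph might show them.
for category, (n, share) in sorted(table.items(), key=lambda item: -item[1][0]):
    print(f"{category:<10}{n:>3}{share:>8.1%}")
```

The counts give bar heights and the relative frequencies give pie slice sizes.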

## 2. Quantitative Data

Quantitative data are numerical data that have a natural order and can be continuous or discrete. Continuous data can take on any real value in an interval or over the whole real number line. Continuous data can be classified as interval. Continuous data can be summarized with box-and-whisker plots, histograms, frequency polygons, and stem-and-leaf displays. Examples of continuous data include variables such as age, height, weight, heart rate, blood pressure, and cholesterol level.

Discrete data take on only a finite or countably infinite number of values (countably infinite meaning that the values can be put in one-to-one correspondence with the integers). Examples of discrete data are the number of children in a household, the number of visits to a doctor in a year, or the number of successful ablation treatments in a clinical trial. Often, discrete data are integers or fractions. Discrete data can be described and displayed in histograms, frequency polygons, stem-and-leaf displays, and box-and-whisker plots (see Section 3.3).
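A box-and-whisker plot is built from a five-number summary (minimum, first quartile, median, third quartile, maximum). The sketch below computes one with Python's `statistics` module; the heart rate values are invented, and `statistics.quantiles` uses its default "exclusive" method here, which is only one of several quartile conventions and may differ from the one used in Section 3.3.

```python
import statistics


def five_number_summary(values):
    """Return min, Q1, median, Q3, and max for quantitative data."""
    # quantiles(..., n=4) returns the three cut points that split the data into quarters.
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


# Made-up resting heart rates in beats per minute.
heart_rates = [62, 70, 68, 75, 80, 66, 72, 90, 58]
summary = five_number_summary(heart_rates)
print(summary)
```

The same function works for discrete counts, such as visits to a doctor in a year, since it relies only on the values being numeric and ordered.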

If the data can be ordered, and we can identify ratios with them, we call the data ratio data. For example, integers form a quantitative discrete set of numbers that are ratio data; we can quantify 2 as being two times 1, 4 as two times 2, and 6 as three times 2. The ability to create ratios distinguishes quantitative data from qualitative data. Qualitative ordinal data can be ordered but cannot be used to produce ratios. We cannot say, for example, that a college education is worth twice as much as a high school education.

Continuous interval data can be used to produce ratios, but not all ratio data are continuous. For example, the integers form a discrete set that can produce ratios, but such data are not interval data because of the gaps between consecutive integers.