Why You Should Avoid Excel

Chapter: Biostatistics for the Health Sciences: Software Packages for Statistical Analysis

The Microsoft product Excel is a very popular and useful spreadsheet program. Excel provides random number generators and functions to compute means, standard deviations, and minima and maxima of a set of numbers in a spreadsheet. It also offers a data analysis toolkit (the Analysis ToolPak) as an optional add-in. The toolkit provides many standard statistical tools, including regression and analysis of variance.

Many universities, particularly business schools, have considered using Excel for routine statistical analyses and as a tool to teach statistics to undergraduate classes. However, statisticians have discovered numerical instabilities in many of its algorithms. In some versions of Excel, even calculations of means and standard deviations could be incorrect because blank rows or columns were treated as zeros instead of being ignored. The pseudorandom number generators used in Excel are also known to be faulty. Microsoft has not fixed many of the problems that have been pointed out to it. For all of these reasons, we think it is better to export Excel data files to other packages, such as SAS, before doing even routine statistical analyses.
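The two kinds of error described above can be illustrated with a short Python sketch. It does not reproduce Excel's internal code, which the text does not describe. The function names and the sample data are hypothetical, and the one-pass "calculator" formula is used here only as a well-known example of a numerically unstable variance algorithm, not as a claim about what any particular Excel version computes.

```python
def mean_blank_as_zero(cells):
    """Mean when blank cells (None) are wrongly counted as zeros."""
    values = [0.0 if c is None else c for c in cells]
    return sum(values) / len(values)


def mean_ignoring_blanks(cells):
    """Mean over the cells that actually contain numbers."""
    values = [c for c in cells if c is not None]
    return sum(values) / len(values)


def one_pass_variance(xs):
    """Sample variance via (sum of squares - (sum)^2 / n) / (n - 1).

    Algebraically correct, but it subtracts two huge, nearly equal
    numbers, so rounding error can swamp the true answer.
    """
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    return (total_sq - total * total / n) / (n - 1)


def two_pass_variance(xs):
    """Sample variance from squared deviations about the mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Hypothetical column with two blank cells.
cells = [4.0, None, 6.0, None, 8.0]
print(mean_blank_as_zero(cells))    # blanks dilute the mean
print(mean_ignoring_blanks(cells))  # mean of 4, 6, 8

# Hypothetical data: small spread around a very large value.
# The true sample variance is 30 (same as for 4, 7, 13, 16).
shifted = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
print(two_pass_variance(shifted))   # stable result
print(one_pass_variance(shifted))   # visibly wrong in double precision
```

In this sketch the two-pass version recovers the variance of 30, while the one-pass version cannot, because each squared value is near 10^18 and double-precision arithmetic can no longer resolve differences of a few units at that magnitude. Dedicated statistical packages generally use the more careful approach.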

Academic institutions are tempted to use Excel for statistical analyses. Nowadays, PCs are owned and used by the schools themselves as well as most of the community. Excel is automatically preinstalled in most of the computers sold to universities and their students. Some universities have site licenses for the distribution of well-known software products. We recommend that you use Excel for typical spreadsheet applications and for graphics such as bar charts, pie charts, and scatter plots but not for statistical analyses.

