General-Purpose Packages

Chapter: Biostatistics for the Health Sciences: Software Packages for Statistical Analysis


Software packages for statistical analysis have evolved over the past three decades from those designed primarily for mainframe applications to software directed toward personal computer users. Examples of statistical packages include BMDP, SPSS, SAS, Splus, Minitab, and a wide variety of other programs. Wilfred Dixon and his colleagues in statistics at the University of California, Los Angeles, produced one of the earliest successful statistical packages, known as BMDP. This package for mainframe computers was so successful in the 1960s and 1970s that eventually BMDP Inc. was founded to handle the production and sale of the software.

BMDP handled summary statistics, hypothesis testing and confidence intervals, regression, and analysis of variance. The demand for additional statistical routines from biostatisticians led Dixon and his colleagues at UCLA to develop multivariate routines for cluster analysis and classification, as well as survival analysis and time series methods.

However, in the 1980s and 1990s microcomputers and, subsequently, personal computers supplanted mainframes. Because BMDP was slow to make adjustments, the business eventually failed. SPSS Inc. bought the software package for distribution and development in the United States. BMDP’s branch in Cork, Ireland, eventually developed into an offshoot company, Statistical Solutions, which still has a license to market and distribute BMDP software in Europe.

Statistical Package for the Social Sciences (SPSS) was originally a software package developed in the late 1960s at Stanford University to help solve problems in the social sciences. Norman H. Nie, C. Hadlai (Tex) Hull, and Dale Bent, three Stanford University graduate students, were the originators. SPSS incorporated in 1975 and established headquarters in Chicago, where the company, headed by Nie as Chairman of the Board, remains today.

A very popular package in the social sciences, SPSS provides standard regression and analysis of variance programs. In addition, it emphasizes multivariate methods that are important to social scientists, e.g., factor analysis, cluster analysis, classification, time series methods, and categorical data analysis. Initially, SPSS suffered because it valued marketing more highly than good numerical algorithms, whereas BMDP excelled at the use of good, stable numerical methods. In recent years SPSS Inc. has improved its algorithms.

SPSS has grown into a large corporation that acquired several major software packages during the period 1994–1999. For example, SPSS bought the rights to BMDP in the United States and bought another good statistical package, SYSTAT, that was developed by Leland Wilkinson. The firm has developed data mining software products in addition to the standard array of statistical tools. As a result of its acquisitions and software enhancements, the company is now in competition with other major statistical software and data analysis vendors such as SAS. To learn about SPSS and all its products, including SYSTAT, go to the company's website: www.spss.com.

Academics at North Carolina State University developed the Statistical Analysis System (SAS) in the late 1960s. Like BMDP, SAS was a software tool devised to handle statistical research problems at a university. SAS became so successful that in 1976 NCSU faculty member James Goodnight, in an agreement with the university, gained the commercial rights to the software and formed the company that is now called the SAS Institute Inc. SAS software has become the most successful statistical software package of all, due in part to Goodnight’s and the other founders’ ability to anticipate the demands of the marketplace. The SAS Institute has produced excellent numerical algorithms and has been at the forefront in designing software with topnotch data management capabilities. Because of its capabilities, SAS is the software of choice for major businesses and the entire pharmaceutical industry. As the personal computer came along, SAS developed PC SAS with a user-friendly Windows interface.

SAS software is divided into modules. The statistics module, called STAT, provides the standard parametric and nonparametric procedures, including analysis of variance, regression, classification and clustering, and survival analysis. Specialized procedures such as time series analysis and statistical quality control have their own modules. We demonstrate SAS output in examples in this text because of SAS’s dominant use in industry. SAS is also a programming language that enables you to produce statistical analyses to meet your particular needs and to manipulate your data sets in ways that enhance the analysis.

SAS now invests a lot of its development money in data mining. Its data mining package, Enterprise Miner, is one of the best packages currently available. Another advantage of SAS is its capability to transport data files in various formats and convert them to SAS data sets without tremendous effort on the part of the user.

To learn the latest information about SAS, you can go to its website: www.sas.com.

S is a statistical language that was developed by AT&T Bell Laboratories in the 1970s and 1980s. It was designed to be an object-oriented language conducive to interactive data analysis and research. It is particularly suited for interactive graphics.

In the mid-1980s, R. Douglas Martin and other faculty members at the University of Washington formed a software company called Statistical Sciences. The company’s purpose was to create a user-friendly front end for S. The founders called their software Splus. The package has been tremendously popular at universities and other research institutions because it provides state-of-the-art statistical tools with a user-friendly interface so that the user does not have to be knowledgeable about the S language. The company was later bought by Mathsoft and has now changed its name to Insightful Corp.

Splus software is known for its interactive capability. It includes the latest developments in time series, outlier detection, density estimation, nonparametric regression, and smoothing techniques including LOESS and spline function curve estimates. Insightful Corp. also has developed classification and regression tree algorithms and a module for group sequential design and analysis. To learn the latest about Splus and other products, go to Insightful’s website: www.insightful.com.

Minitab is another general-purpose statistical package. It was designed to facilitate teaching statistical methods by using computers. Established in 1972, Minitab is used widely in educational applications. The company’s founding statisticians were experts in statistical quality control methods. Consequently, the company prides itself on the usefulness and appropriateness of its quality control tools. Minitab is also a very user-friendly product with good documentation. To learn more about Minitab, go to the company's website at www.minitab.com.

Other good general-purpose software packages on the market today include STATA and NCSS. Their websites, which provide detailed information on their products, are www.stata.com and www.ncss.com, respectively. NCSS also produces a fine program for determining statistical power and sample size (both discussed in Section 16.3).

For a detailed account of software packages that are useful in biostatistics, refer to the article “Software” by Arena and Rockette (2001). In addition to providing detailed discussion of the tools, the authors provide a very useful and extensive table that gives the title of each package, its emphasis relevant to clinical trials, and the name of the current vendor that sells it (including websites and mailing addresses). This list includes special-purpose as well as general-purpose software.

Bayesian and other statistical techniques are benefiting greatly from the Markov chain Monte Carlo computational algorithms. Refer to Robert and Casella (1999) for an excellent reference on this subject. Spiegelhalter and his colleagues at the MRC Biostatistics Unit in Cambridge, England, developed a software tool called BUGS, which stands for Bayesian inference using Gibbs sampling. Gibbs sampling is a particular type of Markov chain Monte Carlo algorithm, as is the Metropolis–Hastings algorithm. BUGS is also used in Bayesian survival analysis methods, as recently described by Ibrahim, Chen, and Sinha (2001). BUGS, with documentation, can be downloaded at no cost from the Internet (http://www.mrc-bsu.cam.ac.uk/bugs/).
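To make the idea of Gibbs sampling concrete, the following is a minimal sketch in Python, not code from BUGS. It assumes a toy target chosen only for illustration: a standard bivariate normal distribution with correlation rho, for which each full conditional distribution is itself normal, x given y being N(rho*y, 1 - rho^2). The function names gibbs_bivariate_normal and sample_correlation are hypothetical.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Illustrative assumption: the target is chosen because both full
    conditionals are known normal distributions.
    """
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho * rho)  # sd of each full conditional
    x, y = 0.0, 0.0  # arbitrary starting point
    draws = []
    for i in range(burn_in + n_samples):
        # Update each coordinate from its conditional given the other.
        x = rng.gauss(rho * y, cond_sd)
        y = rng.gauss(rho * x, cond_sd)
        if i >= burn_in:  # discard the early, pre-convergence draws
            draws.append((x, y))
    return draws


def sample_correlation(draws):
    """Pearson correlation of the sampled (x, y) pairs."""
    n = len(draws)
    mean_x = sum(x for x, _ in draws) / n
    mean_y = sum(y for _, y in draws) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in draws)
    sxx = sum((x - mean_x) ** 2 for x, _ in draws)
    syy = sum((y - mean_y) ** 2 for _, y in draws)
    return sxy / math.sqrt(sxx * syy)
```

Calling sample_correlation(gibbs_bivariate_normal(0.8, 20000)) should return a value close to 0.8. In realistic Bayesian models the conditionals are posterior distributions of parameters rather than this toy target, but the alternating update pattern is the same.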

At present, the most commonly used version of BUGS is WinBUGS. This attractive version is menu-driven for the Windows operating system. WinBUGS is well described with many examples in Congdon (2001). Both Gibbs sampling and the Metropolis–Hastings algorithm can be implemented through WinBUGS. Diagnostic software for convergence of Markov chains, called CODA (Convergence Diagnostics and Output Analysis), by Martin Plummer can be downloaded at http://www-fis.iarc.fr/coda/. Brian Smith has produced another, more recent package, which is available at http://www.public-health.uiowa.edu/boa/.
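As an illustration of what a convergence diagnostic measures, the sketch below computes the Gelman-Rubin potential scale reduction factor, one widely used diagnostic, for several chains produced by a simple random-walk Metropolis sampler (a special case of Metropolis–Hastings with a symmetric proposal). This is a hand-written Python sketch under stated assumptions, not the CODA or WinBUGS implementation: the target is assumed to be a standard normal density, and the names metropolis_normal and gelman_rubin are hypothetical.

```python
import math
import random


def metropolis_normal(n_samples, start, step=2.0, seed=0):
    """Random-walk Metropolis chain targeting a standard normal density."""
    rng = random.Random(seed)
    x = start
    chain = []
    for _ in range(n_samples):
        proposal = x + rng.uniform(-step, step)  # symmetric proposal
        # Log of the acceptance ratio for the assumed N(0, 1) target.
        log_ratio = 0.5 * (x * x - proposal * proposal)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal  # accept; otherwise the chain stays at x
        chain.append(x)
    return chain


def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains."""
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand_mean = sum(means) / m
    # Between-chain variance.
    between = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in means)
    # Average within-chain variance.
    within = sum(
        sum((x - mu) ** 2 for x in c) / (n - 1) for c, mu in zip(chains, means)
    ) / m
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


# Run three chains from dispersed starting values and keep the second half.
chains = [
    metropolis_normal(5000, start, seed=i)[2500:]
    for i, start in enumerate([-5.0, 0.0, 5.0])
]
r_hat = gelman_rubin(chains)
```

Values of r_hat close to 1 suggest that the chains have mixed and are sampling from the same distribution; values well above 1 indicate that the chains have not yet converged and should be run longer.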
