Univariate and Multivariate Analysis of Categorical Attributes with Many Response Categories

Paul R. Yarnold

Optimal Data Analysis, LLC

Until only a few weeks ago, the disentanglement of effects in purely categorical designs (those in which all variables are categorical) was poorly understood. This was especially true of notoriously complex rectangular categorical designs (RCDs), in which variables have different numbers of response categories. However, univariate and multivariate optimal (“maximum-accuracy”) statistical methods, specifically UniODA and automated CTA, make the analysis of such designs straightforward. These methods are illustrated using an example involving n=1,568 randomly selected patients having either confirmed or presumed Pneumocystis carinii pneumonia (PCP). The four categorical variables used in the analysis are patient status (two categories: alive, dead), gender (two categories: male, female), city of residence (seven categories), and type of health insurance (ten categories). Examination of the cross-tabulations of these variables makes it obvious why conventional statistical methods such as chi-square analysis, logistic regression analysis, and log-linear analysis are both inappropriate for such designs and easily overwhelmed by them. In contrast, UniODA and CTA identified maximum-accuracy solutions effortlessly in this application.
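
As a rough illustration of why cross-tabulations of these variables become sparse, the short sketch below counts the cells implied by the category counts stated above and divides the sample size among them. The variable names and the helper function are illustrative assumptions, not part of the article, and the per-cell figures are simple averages rather than observed frequencies.

```python
from math import prod

# Category counts and sample size as stated in the abstract.
# The dictionary keys are illustrative labels, not names used by the author.
levels = {"status": 2, "gender": 2, "city": 7, "insurance": 10}
n_patients = 1568


def cell_count(names):
    """Number of cells in the cross-tabulation of the named variables."""
    return prod(levels[name] for name in names)


# Full four-way cross-tabulation.
full_cells = cell_count(levels)
mean_per_cell = n_patients / full_cells

# Two-way table of the two variables with the most categories.
city_by_insurance = cell_count(["city", "insurance"])
mean_city_by_insurance = n_patients / city_by_insurance

print(f"four-way cells: {full_cells}, mean patients per cell: {mean_per_cell:.1f}")
print(f"city x insurance cells: {city_by_insurance}, "
      f"mean patients per cell: {mean_city_by_insurance:.1f}")
```

Because patients are unlikely to be spread evenly across cities and insurance types, many cells of the full table would presumably hold far fewer observations than the average, which is the situation in which the common rule of thumb for chi-square expected counts is hard to satisfy.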
