Standards for Reporting UniODA Findings Expanded to Include ESP and All Possible Aggregated Confusion Tables

Paul R. Yarnold, Ph.D.

Optimal Data Analysis, LLC

UniODA models maximize Effect Strength for Sensitivity (ESS), a normed measure of classification accuracy (0 = chance, 100 = perfect classification) that indexes the model's ability to accurately identify the members of different class categories in the sample. In a study discriminating genders, for example, ESS indexes the percent of each gender that the model classified accurately. Unlike ESS, the Effect Strength for Predictive Value (ESP) varies with the base rate. Measured on the same scale as ESS, ESP indexes the model's ability to produce accurate classifications. In the study discriminating genders, for example, ESP indexes the percent of the time that the model's prediction that an observation was male, or that it was female, was accurate. While ESS is important in guiding the development and testing of theory, ESP is important in translating theory from the laboratory to real-world applications, and is thus added to the recommended minimum standards for reporting all UniODA findings. In addition, evaluating all possible aggregated confusion tables aids in interpreting UniODA findings and in assessing the potential for increasing classification accuracy by improving the measurement of ordered class variables and/or attributes, and so was also added as a recommended minimum standard. Current standards are demonstrated using three examples: (1) using income to discriminate gender in a sample of 416 general internal medicine (GIM) patients, testing the a priori hypothesis that men have higher income than women; (2) using body mass index (BMI) to discriminate income in a sample of 411 GIM patients, testing the a priori hypothesis that BMI and income are positively related; and (3) discriminating mental focus using GHA (a measure of barometric pressure) in a post hoc analysis of 297 sequential daily entries made by a fibromyalgia patient using an intelligent health diary, which were separated into training and hold-out validity samples.
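The distinction between ESS and ESP can be illustrated with a minimal Python sketch. It assumes the norming commonly used for these indices: the mean sensitivity (for ESS) or mean predictive value (for ESP) across the C class categories is rescaled so that the chance level of 100/C maps to 0 and perfect accuracy maps to 100. The function name `effect_strengths` and the confusion tables are hypothetical and are not taken from the article's examples.

```python
def effect_strengths(table):
    """Return (ESS, ESP) for a square confusion table.

    table[i][j] is the number of observations whose actual class is i
    and whose predicted class is j. Assumes every row and every column
    contains at least one observation.
    """
    k = len(table)
    chance = 100.0 / k  # mean accuracy expected by chance, in percent

    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]

    # Sensitivity: percent of each actual class that was classified correctly.
    sensitivity = [100.0 * table[i][i] / row_totals[i] for i in range(k)]
    # Predictive value: percent of each predicted class that was correct.
    predictive = [100.0 * table[j][j] / col_totals[j] for j in range(k)]

    # Rescale so that chance = 0 and perfect classification = 100.
    ess = 100.0 * (sum(sensitivity) / k - chance) / (100.0 - chance)
    esp = 100.0 * (sum(predictive) / k - chance) / (100.0 - chance)
    return ess, esp


# Hypothetical table with equal base rates (100 observations per class).
balanced = [[80, 20],
            [10, 90]]

# Same sensitivities, but the second class is ten times more common.
unbalanced = [[80, 20],
              [100, 900]]

print(effect_strengths(balanced))
print(effect_strengths(unbalanced))
```

In this illustration both tables have sensitivities of 80% and 90%, so ESS is 70 in each case, whereas ESP falls from roughly 70.7 to roughly 42.3 when the second class becomes ten times more common. This is the base-rate dependence that motivates reporting ESP alongside ESS.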
