MegaODA Large Sample and BIG DATA Time Trials: Separating the Chaff

Robert C. Soltysik & Paul R. Yarnold

Optimal Data Analysis, LLC

The newly released MegaODA™ software can conduct UniODA analyses for an unlimited number of attributes using samples as large as one million observations. To minimize the computational burden of the Monte Carlo simulation used to estimate the Type I error rate (p), the first step in statistical analysis is to identify effects that are not statistically significant (ns). This article presents an experimental simulation exploring the ability of MegaODA to identify ns effects in a host of designs involving a binary class variable, under maximally challenging discrimination conditions (all data are random), for sample sizes of n=100,000 and n=1,000,000. Most analyses were solved in CPU seconds running MegaODA on a 3 GHz Intel Pentium D microcomputer. With MegaODA, it is straightforward to rapidly rule out ns effects using Monte Carlo simulation with BIG DATA for large numbers of attributes in simple or complex, single- or multiple-sample designs involving categorical or ordered attributes, with or without weights applied to individual observations.
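
The idea of ruling out ns effects early can be illustrated with a minimal Python sketch. It is not MegaODA's implementation, which is not described here. It assumes a fixed Monte Carlo budget and a simple stopping rule: once enough permuted data sets match or beat the observed statistic that the estimated p must exceed alpha, the remaining iterations are skipped. The function names `best_cut_accuracy` and `monte_carlo_ns_screen`, the `alpha` and `max_iter` parameters, and the use of raw classification accuracy as the statistic are all assumptions made for illustration.

```python
import random


def best_cut_accuracy(values, labels):
    """Best accuracy of a single cut point on an ordered attribute for a
    binary (0/1) class variable, in either direction. This is a simplified
    stand-in for a UniODA-style statistic (an assumption of this sketch)."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    total_ones = sum(label for _, label in pairs)
    # Baseline: assign every observation to the larger class.
    best = max(total_ones, n - total_ones)
    ones_left = 0
    for i in range(n - 1):
        ones_left += pairs[i][1]
        if pairs[i][0] == pairs[i + 1][0]:
            continue  # no cut point exists between tied values
        left = i + 1
        # Rule A: predict 0 at or below the cut, 1 above it.
        correct_a = (left - ones_left) + (total_ones - ones_left)
        # Rule B is the mirror image, so it is correct where A is wrong.
        correct_b = n - correct_a
        best = max(best, correct_a, correct_b)
    return best / n


def monte_carlo_ns_screen(values, labels, alpha=0.05, max_iter=1000, seed=0):
    """Permutation test that stops as soon as the effect must be ns.

    Hypothetical stopping rule: with a budget of max_iter permutations, the
    final estimate of p exceeds alpha once more than alpha * max_iter
    permuted statistics are at least as large as the observed one."""
    rng = random.Random(seed)
    observed = best_cut_accuracy(values, labels)
    shuffled = list(labels)
    stop_count = int(alpha * max_iter) + 1
    exceed = 0
    for iteration in range(1, max_iter + 1):
        rng.shuffle(shuffled)
        if best_cut_accuracy(values, shuffled) >= observed:
            exceed += 1
            if exceed >= stop_count:
                # p can no longer fall to alpha or below: rule the effect out.
                return {"ns": True, "iterations": iteration,
                        "p_lower_bound": exceed / max_iter}
    # Budget exhausted without reaching the stopping count.
    return {"ns": False, "iterations": max_iter, "p": exceed / max_iter}
```

With random data, most permuted statistics are comparable to the observed one, so the stopping count tends to be reached after only a small fraction of the budget. That is the sense in which screening for ns effects first can reduce the Monte Carlo workload.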
