Ariel Linden, Fred B. Bryant & Paul R. Yarnold
Linden Consulting Group, LLC, Loyola University Chicago & Optimal Data Analysis, LLC
Recent research compared the ability of various classification algorithms [logistic regression (LR), random forests (RF), support vector machines (SVM), boosted regression (BR), multi-layer perceptron neural net model (MLP), and classification tree analysis (CTA)] to correctly find no relationship between a binary class (dependent) variable and ten randomly generated attributes (covariates): only CTA correctly found no model. We use the same ten-variable N = 1,000 dataset to assess the training classification accuracy of models developed by logistic discriminant analysis (LDA), generalized structural equation modelling (GSEM), and robust diagonally-weighted least-squares (DWLS) SEM for binary outcomes. Except for CTA, all machine-learning algorithms assessed thus far have identified training effects in random data.
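
As an illustration of what "identifying training effects in random data" can look like, the following is a minimal sketch and not the authors' analysis. It assumes standard-normal attributes and a coin-flip class variable (the distribution of the original dataset is not stated here), and it fits a plain logistic regression by gradient ascent rather than LDA, GSEM, or DWLS. All function names, the seed, the step size, and the iteration count are hypothetical choices made for the example.

```python
import math
import random


def make_random_data(n=1000, k=10, seed=0):
    """Return (X, y): n rows of k standard-normal attributes, coin-flip class.

    The attributes and the class are generated independently, so any
    relationship a model reports between them is noise by construction.
    """
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(n)]
    y = [rng.randint(0, 1) for _ in range(n)]
    return X, y


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _linear(w, row):
    # w[0] is the intercept; w[1:] are the attribute weights.
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))


def log_likelihood(w, X, y):
    """Bernoulli log-likelihood of the weights w on (X, y)."""
    total = 0.0
    for row, label in zip(X, y):
        z = _linear(w, row)
        s = z if label == 1 else -z  # log-prob of the observed class is log sigmoid(s)
        total += -math.log1p(math.exp(-s)) if s >= 0 else s - math.log1p(math.exp(s))
    return total


def fit_logistic(X, y, step=0.3, iters=100):
    """Fit logistic regression by full-batch gradient ascent on the mean log-likelihood."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(iters):
        grad = [0.0] * (k + 1)
        for row, label in zip(X, y):
            err = label - _sigmoid(_linear(w, row))
            grad[0] += err
            for j, xj in enumerate(row, start=1):
                grad[j] += err * xj
        w = [wj + step * gj / n for wj, gj in zip(w, grad)]
    return w


def training_accuracy(w, X, y):
    """Share of training rows whose predicted class (p > 0.5) matches the label."""
    hits = sum(int((1 if _linear(w, row) > 0 else 0) == label)
               for row, label in zip(X, y))
    return hits / len(y)


if __name__ == "__main__":
    X, y = make_random_data()
    w = fit_logistic(X, y)
    print(f"training accuracy on pure noise: {training_accuracy(w, X, y):.3f}")
```

Because the weights are tuned to the same rows on which accuracy is then measured, the fitted model's log-likelihood is at least as high as that of the all-zero (chance) model, and its training accuracy can exceed 50% even though no true relationship exists. The exact figure depends on the seed and is not comparable to the results reported for the original dataset.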