Randomized Blocks Designs: Omnibus vs. Pairwise Comparison, Fixed vs. Relative Optimal Discriminant Threshold, and Raw vs. Ipsative z-Score Measures

Paul R. Yarnold & Ariel Linden Optimal Data Analysis, LLC & Linden Consulting Group, LLC This study extends recent research assessing the use of relative thresholds in matched-pairs designs, for a randomized blocks design in which four treatments are randomly assigned to blood samples drawn from each of eight people (each person treated as a …

Using Fixed and Relative Optimal Discriminant Thresholds in Randomized Blocks (Matched-Pairs) Designs

Paul R. Yarnold & Ariel Linden Optimal Data Analysis, LLC & Linden Consulting Group, LLC Optimal discriminant analysis (ODA) is often used to compare values of one (or more) attributes between two (or more) groups of observations with respect to a fixed discriminant threshold that maximizes accuracy normed against chance for the sample. However, a …

ODA vs. t-Test: Lysozyme Levels in the Gastric Juice of Patients with Peptic Ulcer vs. Normal Controls

Paul R. Yarnold Optimal Data Analysis, LLC Lysozyme levels in gastric juice of peptic ulcer patients were compared against normal controls by t-test, finding p<0.05. Because standard deviations differed by a factor of two between groups, and were proportional to the means, analysis of natural logarithms was instead deemed appropriate: the resulting t-test wasn’t statistically …

Regression vs. Novometric-Based Assessment of Inter-Examiner Reliability

Paul R. Yarnold Optimal Data Analysis, LLC Four examiners independently recorded the DMFS (decayed, missing, filled surfaces) scores of ten patients. Inter-examiner correspondence of DMFS scores was evaluated using Pearson correlation and novometric analysis. Whereas essentially perfect correlation models were unable to accurately predict DMFS scores in training analysis, novometric models were consistently perfect in …

Fixed vs. Relative Optimal Discriminant Thresholds: Pairwise Comparisons of Raters’ Ratings for a Sample

Paul R. Yarnold Optimal Data Analysis, LLC Foundational to the ODA algorithm when used with an ordered attribute is the identification of the optimal threshold—the specific cutpoint that yields the most accurate (weighted) classification solution for a sample of observations. ODA models involving a single optimal threshold will henceforth be called “fixed-threshold” models. This note …

Logistic Discriminant Analysis and Structural Equation Modeling Both Identify Effects in Random Data

Ariel Linden, Fred B. Bryant & Paul R. Yarnold Linden Consulting Group, LLC, Loyola University Chicago & Optimal Data Analysis, LLC Recent research compared the ability of various classification algorithms [logistic regression (LR), random forests (RF), support vector machines (SVM), boosted regression (BR), multi-layer perceptron neural net model (MLP), and classification tree analysis (CTA)] to …

Multi-Layer Perceptron Neural Net Model Identifies Effect in Random Data

Ariel Linden & Paul R. Yarnold Linden Consulting Group, LLC & Optimal Data Analysis, LLC Prior research contrasted the ability of different classification algorithms [logistic regression (LR), random forests (RF), boosted regression (BR), support vector machines (SVM), classification tree analysis (CTA)] to correctly fail to identify a relationship between a binary class (dependent) variable and …

Optimizing Suboptimal Classification Trees: Matlab® CART Model Predicting Probability of Lower Limb Prosthesis User’s Functional Potential

Paul R. Yarnold & Ariel Linden Optimal Data Analysis, LLC & Linden Consulting Group, LLC After any algorithm which controls the growth of a classification tree model has completed, the resulting model must be pruned in order to explicitly maximize predictive accuracy normed against chance. This article illustrates manually-conducted maximum-accuracy pruning of a classification and …

Regression vs. Novometric Analysis Predicting Income Based on Education

Paul R. Yarnold Optimal Data Analysis, LLC This study compares linear regression vs. novometric models of the association of education and income for a sample of 32 observations. Regression analysis identified a relatively strong effect (R-squared=56.4), but only 25% of point predictions fell within a 20% band of actual income. Novometric analysis identified a strong …

Effect of Sample Size on Discovery of Relationships in Random Data by Classification Algorithms

Ariel Linden & Paul R. Yarnold Linden Consulting Group, LLC & Optimal Data Analysis, LLC In a recent paper, we assessed the ability of several classification algorithms (logistic regression, random forests, boosted regression, support vector machines, and classification tree analysis [CTA]) to correctly not identify a relationship between the dependent variable and ten covariates generated …

ODA vs. χ2, r, and τ: Trauma Exposure in Childhood and Duration of Participation in Eating-Disorder Treatment Program

Paul R. Yarnold Optimal Data Analysis, LLC This note illustrates the disorder and confusion attributable to an analytic ethos whereby a smorgasbord of different statistical tests is used to test identical or parallel statistical hypotheses. Herein four classic methods are used for an application with a binary class (dependent) variable and an ordered attribute (independent variable) …

Some Machine Learning Algorithms Find Relationships Between Variables When None Exist — CTA Doesn’t

Ariel Linden & Paul R. Yarnold Linden Consulting Group, LLC & Optimal Data Analysis, LLC Automated machine learning algorithms are widely promoted as the best approach for estimating propensity scores, because these methods detect patterns in the data which manual efforts fail to identify. If classification algorithms are indeed ideal for identifying relationships between treatment …

Weighted Optimal Markov Model of a Single Outcome: Ipsative Standardization of Ordinal Ratings is Unnecessary

Paul R. Yarnold Optimal Data Analysis, LLC This note empirically compares the use of raw vs. ipsatively standardized variables in optimal weighted Markov analysis involving a series for a single outcome—presently, ratings of sleep difficulties for an individual. Findings indicate that the raw score and ipsatively standardized ordinal ratings yield equivalent results in such designs. …

More On: “Optimizing Suboptimal Classification Trees: S-PLUS® Propensity Score Model for Adjusted Comparison of Hospitalized vs. Ambulatory Patients with Community-Acquired Pneumonia”

Paul R. Yarnold Optimal Data Analysis, LLC A recent article optimized ESS of a suboptimal classification tree model that discriminated hospitalized vs. ambulatory patients with community-acquired pneumonia (CAP). This note suggests possible alternatives for two original attributes as a means of increasing model accuracy: patient disease-specific knowledge vs. “college education”, and patient-specific functional status …

Optimizing Suboptimal Classification Trees: S-PLUS® Propensity Score Model for Adjusted Comparison of Hospitalized vs. Ambulatory Patients with Community-Acquired Pneumonia

Paul R. Yarnold Optimal Data Analysis, LLC Pruning to maximize model accuracy (requiring simple hand computation) is applied to a classification tree model developed via S-PLUS to create propensity scores to improve causal inference in comparing hospitalized vs. ambulatory patients with community-acquired pneumonia. Research reported herein constitutes a thought-provoking example of a striking misalliance between …