Nathaniel J. Rhodes Chicago College of Pharmacy, and the Pharmacometrics Center of Excellence, Midwestern University Statistical power analysis simulation results are provided for determining the “worst-case” sample size assuming minimal measurement precision and relatively weak or moderate effect strengths. In simulated trials with relatively weak effects (ESS = 24%), greater than 80% and greater than … Continue reading Statistical Power Analysis in ODA, CTA and Novometrics (Invited)
Implementing ODA from Within Stata: An Application to Estimating Treatment Effects using Observational Data (Invited)
Ariel Linden Linden Consulting Group, LLC In this paper, I demonstrate how treatment effects in observational data can be estimated for both binary and multivalued treatments using the new Stata package for implementing ODA. Matching and weighting techniques are implemented and ODA results are compared to those using conventional regression approaches. View journal article
Implementing ODA From Within Stata: An Application to Data From a Randomized Controlled Trial (Invited)
Ariel Linden Linden Consulting Group, LLC In this paper, the new Stata package for implementing ODA is introduced by reanalyzing data from a study by Linden and Butterworth (2014) that investigated the effect of a comprehensive hospital-based intervention in reducing readmissions for chronically ill patients. In the original analysis, negative binomial regression was used to … Continue reading Implementing ODA From Within Stata: An Application to Data From a Randomized Controlled Trial (Invited)
Running MegaODA and CTA Software within Stata
Dr. Ariel Linden created and published the Stata programs for running MegaODA and CTA software. To run MegaODA and CTA software within Stata, refer to the ODA articles that address this topic in their titles.
Reformulating the First Axiom of Novometric Theory: Assessing Minimum Sample Size in Experimental Design
Paul R. Yarnold Optimal Data Analysis, LLC The first axiom of novometric theory is reformulated, and two methods for assessing the minimum required sample size in experimental design are discussed. View journal article
Selecting an Appropriate Weighting Strategy in Maximum-Accuracy Time-to-Event (Survival) Analysis
Paul R. Yarnold, Nathaniel J. Rhodes, & Ariel Linden Optimal Data Analysis, LLC; Chicago College of Pharmacy and the Pharmacometrics Center of Excellence at Midwestern University; & Linden Consulting Group, LLC Different weighting schemes in optimal survival analysis are considered. View journal article
Randomized Blocks Designs: Omnibus vs. Pairwise Comparison, Fixed vs. Relative Optimal Discriminant Threshold, and Raw vs. Ipsative z-Score Measures
Paul R. Yarnold & Ariel Linden Optimal Data Analysis, LLC & Linden Consulting Group, LLC This study extends recent research assessing the use of relative thresholds in matched-pairs designs, for a randomized blocks design in which four treatments are randomly assigned to blood samples drawn from each of eight people (each person treated as a … Continue reading Randomized Blocks Designs: Omnibus vs. Pairwise Comparison, Fixed vs. Relative Optimal Discriminant Threshold, and Raw vs. Ipsative z-Score Measures
Using Fixed and Relative Optimal Discriminant Thresholds in Randomized Blocks (Matched-Pairs) Designs
Paul R. Yarnold & Ariel Linden Optimal Data Analysis, LLC & Linden Consulting Group, LLC Optimal discriminant analysis (ODA) is often used to compare values of one (or more) attributes between two (or more) groups of observations with respect to a fixed discriminant threshold that maximizes accuracy normed against chance for the sample. However, a … Continue reading Using Fixed and Relative Optimal Discriminant Thresholds in Randomized Blocks (Matched-Pairs) Designs
ODA vs. t-Test: Lysozyme Levels in the Gastric Juice of Patients with Peptic Ulcer vs. Normal Controls
Paul R. Yarnold Optimal Data Analysis, LLC Lysozyme levels in gastric juice of peptic ulcer patients were compared against normal controls by t-test, finding p<0.05. Because standard deviations differed by a factor of two between groups, and were proportional to the means, analysis of natural logarithms was instead deemed appropriate: the resulting t-test wasn’t statistically … Continue reading ODA vs. t-Test: Lysozyme Levels in the Gastric Juice of Patients with Peptic Ulcer vs. Normal Controls
Regression vs. Novometric-Based Assessment of Inter-Examiner Reliability
Paul R. Yarnold Optimal Data Analysis, LLC Four examiners independently recorded the DMFS (decayed, missing, filled surfaces) scores of ten patients. Inter-examiner correspondence of DMFS scores was evaluated using Pearson correlation and novometric analysis. Whereas essentially perfect correlation models were unable to accurately predict DMFS scores in training analysis, novometric models were consistently perfect in … Continue reading Regression vs. Novometric-Based Assessment of Inter-Examiner Reliability
Fixed vs. Relative Optimal Discriminant Thresholds: Pairwise Comparisons of Raters’ Ratings for a Sample
Paul R. Yarnold Optimal Data Analysis, LLC Foundational to the ODA algorithm when used with an ordered attribute is the identification of the optimal threshold—the specific cutpoint that yields the most accurate (weighted) classification solution for a sample of observations. ODA models involving a single optimal threshold will henceforth be called “fixed-threshold” models. This note … Continue reading Fixed vs. Relative Optimal Discriminant Thresholds: Pairwise Comparisons of Raters’ Ratings for a Sample
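The idea of a single fixed cutpoint on an ordered attribute can be illustrated with a minimal Python sketch. It is not the ODA or MegaODA software and does not reproduce its algorithm; it is an unweighted brute-force search for two classes coded 0 and 1. The function name `best_fixed_threshold`, the use of midpoints between adjacent distinct values as candidate cutpoints, and the scoring rule (mean per-class sensitivity normed against a 50% chance level and reported on a 0 to 100 scale) are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def best_fixed_threshold(
    values: Sequence[float], labels: Sequence[int]
) -> Tuple[float, int, float]:
    """Brute-force search for one cutpoint on an ordered attribute.

    Hypothetical illustration only: two classes coded 0/1, unweighted.
    Returns (cutpoint, class predicted above the cutpoint, normed accuracy).
    """
    n0 = sum(1 for y in labels if y == 0)
    n1 = sum(1 for y in labels if y == 1)
    if n0 == 0 or n1 == 0:
        raise ValueError("both classes must be present")

    # Candidate cutpoints: midpoints between adjacent distinct values.
    uniq = sorted(set(values))
    cuts = [(a + b) / 2 for a, b in zip(uniq, uniq[1:])]

    best = None
    for cut in cuts:
        # Try both directions: class 1 above the cut, or class 0 above it.
        for high_class in (1, 0):
            low_class = 1 - high_class
            preds = [high_class if v > cut else low_class for v in values]
            sens0 = sum(1 for p, y in zip(preds, labels) if y == 0 and p == 0) / n0
            sens1 = sum(1 for p, y in zip(preds, labels) if y == 1 and p == 1) / n1
            mean_sens = (sens0 + sens1) / 2
            # Assumed norming: 0 = chance (50% for two classes), 100 = perfect.
            normed = 100 * (mean_sens - 0.5) / 0.5
            if best is None or normed > best[2]:
                best = (cut, high_class, normed)

    if best is None:
        raise ValueError("attribute has only one distinct value")
    return best


if __name__ == "__main__":
    print(best_fixed_threshold([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1]))
```

Ties are resolved in favor of the first candidate found, which is an arbitrary choice made for this sketch.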
Logistic Discriminant Analysis and Structural Equation Modeling Both Identify Effects in Random Data
Ariel Linden, Fred B. Bryant & Paul R. Yarnold Linden Consulting Group, LLC, Loyola University Chicago & Optimal Data Analysis, LLC Recent research compared the ability of various classification algorithms [logistic regression (LR), random forests (RF), support vector machines (SVM), boosted regression (BR), multi-layer perceptron neural net model (MLP), and classification tree analysis (CTA)] to … Continue reading Logistic Discriminant Analysis and Structural Equation Modeling Both Identify Effects in Random Data
Multi-Layer Perceptron Neural Net Model Identifies Effect in Random Data
Ariel Linden & Paul R. Yarnold Linden Consulting Group, LLC & Optimal Data Analysis, LLC Prior research contrasted the ability of different classification algorithms [logistic regression (LR), random forests (RF), boosted regression (BR), support vector machines (SVM), classification tree analysis (CTA)] to correctly fail to identify a relationship between a binary class (dependent) variable and … Continue reading Multi-Layer Perceptron Neural Net Model Identifies Effect in Random Data
Optimizing Suboptimal Classification Trees: Matlab® CART Model Predicting Probability of Lower Limb Prosthesis User’s Functional Potential
Paul R. Yarnold & Ariel Linden Optimal Data Analysis, LLC & Linden Consulting Group, LLC After any algorithm which controls the growth of a classification tree model has completed, the resulting model must be pruned in order to explicitly maximize predictive accuracy normed against chance. This article illustrates manually-conducted maximum-accuracy pruning of a classification and … Continue reading Optimizing Suboptimal Classification Trees: Matlab® CART Model Predicting Probability of Lower Limb Prosthesis User’s Functional Potential
Regression vs. Novometric Analysis Predicting Income Based on Education
Paul R. Yarnold Optimal Data Analysis, LLC This study compares linear regression vs. novometric models of the association of education and income for a sample of 32 observations. Regression analysis identified a relatively strong effect (R-squared=56.4%), but only 25% of point predictions fell within a 20% band of actual income. Novometric analysis identified a strong … Continue reading Regression vs. Novometric Analysis Predicting Income Based on Education
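The share of point predictions that land near the actual value can be computed with a short Python sketch. The excerpt does not define the band, so this sketch assumes a prediction counts as "within a 20% band" when its absolute error is at most 20% of the actual value. The function name `fraction_within_band`, the `band` parameter, and the sample numbers are hypothetical and are not data from the article.

```python
from typing import Sequence


def fraction_within_band(
    actual: Sequence[float], predicted: Sequence[float], band: float = 0.20
) -> float:
    """Fraction of predictions whose absolute error is <= band * |actual|.

    Assumed definition of the band, for illustration only.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    hits = sum(
        1 for a, p in zip(actual, predicted) if abs(p - a) <= band * abs(a)
    )
    return hits / len(actual)


if __name__ == "__main__":
    # Made-up numbers, not the article's data.
    actual = [100.0, 100.0, 100.0, 100.0]
    predicted = [110.0, 150.0, 95.0, 70.0]
    print(fraction_within_band(actual, predicted))
```

A summary like this complements R-squared because a model can explain much of the variance while still missing most individual values by a wide margin.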
Effect of Sample Size on Discovery of Relationships in Random Data by Classification Algorithms
Ariel Linden & Paul R. Yarnold Linden Consulting Group, LLC & Optimal Data Analysis, LLC In a recent paper, we assessed the ability of several classification algorithms (logistic regression, random forests, boosted regression, support vector machines, and classification tree analysis [CTA]) to correctly not identify a relationship between the dependent variable and ten covariates generated … Continue reading Effect of Sample Size on Discovery of Relationships in Random Data by Classification Algorithms
ODA vs. χ2, r, and τ: Trauma Exposure in Childhood and Duration of Participation in Eating-Disorder Treatment Program
Paul R. Yarnold Optimal Data Analysis, LLC This note illustrates the disorder and confusion attributable to an analytic ethos whereby a smorgasbord of different statistical tests is used to test identical or parallel statistical hypotheses. Herein four classic methods are used for an application with a binary class (dependent) variable and an ordered attribute (independent variable) … Continue reading ODA vs. χ2, r, and τ: Trauma Exposure in Childhood and Duration of Participation in Eating-Disorder Treatment Program
Novometric Stepwise CTA Analysis Discriminating Three Class Categories Using Two Ordered Attributes
Paul R. Yarnold & Ariel Linden Optimal Data Analysis, LLC & Linden Consulting Group, LLC The adaptability of novometric analysis is illustrated for an example involving three class categories and two ordered attributes. View journal article
Some Machine Learning Algorithms Find Relationships Between Variables When None Exist — CTA Doesn’t
Ariel Linden & Paul R. Yarnold Linden Consulting Group, LLC & Optimal Data Analysis, LLC Automated machine learning algorithms are widely promoted as the best approach for estimating propensity scores, because these methods detect patterns in the data which manual efforts fail to identify. If classification algorithms are indeed ideal for identifying relationships between treatment … Continue reading Some Machine Learning Algorithms Find Relationships Between Variables When None Exist — CTA Doesn’t
Optimal Markov Model Relating Two Time-Lagged Outcomes
Paul R. Yarnold Optimal Data Analysis, LLC This paper demonstrates the use of maximum-accuracy weighted Markov analysis to model the relationship between two time-lagged variables—serial ratings of pain during the day and subsequent quality of sleep at night—for an individual. View journal article