Selecting an Appropriate Weighting Strategy in Maximum-Accuracy Time-to-Event (Survival) Analysis

January 13, 2020 ~ paulyarnold

Paul R. Yarnold, Nathaniel J. Rhodes, & Ariel Linden
Optimal Data Analysis LLC, Chicago College of Pharmacy and the Pharmacometrics Center of Excellence at Midwestern University, & Linden Consulting Group LLC

Different weighting schemes in optimal survival analysis are considered.

Randomized Blocks Designs: Omnibus vs. Pairwise Comparison, Fixed vs. Relative Optimal Discriminant Threshold, and Raw vs. Ipsative z-Score Measures

June 13, 2019 ~ paulyarnold

Paul R. Yarnold & Ariel Linden
Optimal Data Analysis, LLC & Linden Consulting Group, LLC

This study extends recent research assessing the use of relative thresholds in matched-pairs designs, for a randomized blocks design in which four treatments are randomly assigned to blood samples drawn from each of eight people (each person treated as a …

Using Fixed and Relative Optimal Discriminant Thresholds in Randomized Blocks (Matched-Pairs) Designs

May 30, 2019 ~ paulyarnold

Paul R. Yarnold & Ariel Linden
Optimal Data Analysis, LLC & Linden Consulting Group, LLC

Optimal discriminant analysis (ODA) is often used to compare values of one (or more) attributes between two (or more) groups of observations with respect to a fixed discriminant threshold that maximizes accuracy normed against chance for the sample. However, a …

ODA vs. t-Test: Lysozyme Levels in the Gastric Juice of Patients with Peptic Ulcer vs. Normal Controls

May 23, 2019 ~ paulyarnold

Paul R. Yarnold
Optimal Data Analysis, LLC

Lysozyme levels in gastric juice of peptic ulcer patients were compared against normal controls by t-test, finding p<0.05. Because standard deviations differed by a factor of two between groups, and were proportional to the means, analysis of natural logarithms was instead deemed appropriate: the resulting t-test wasn’t statistically …

Regression vs. Novometric-Based Assessment of Inter-Examiner Reliability

May 23, 2019 ~ paulyarnold

Paul R. Yarnold
Optimal Data Analysis, LLC

Four examiners independently recorded the DMFS (decayed, missing, filled surfaces) scores of ten patients. Inter-examiner correspondence of DMFS scores was evaluated using Pearson correlation and novometric analysis. Whereas essentially perfect correlation models were unable to accurately predict DMFS scores in training analysis, novometric models were consistently perfect in …

Fixed vs. Relative Optimal Discriminant Thresholds: Pairwise Comparisons of Raters’ Ratings for a Sample

May 19, 2019 ~ paulyarnold

Paul R. Yarnold
Optimal Data Analysis, LLC

Foundational to the ODA algorithm when used with an ordered attribute is the identification of the optimal threshold: the specific cutpoint that yields the most accurate (weighted) classification solution for a sample of observations. ODA models involving a single optimal threshold will henceforth be called “fixed-threshold” models. This note …

Logistic Discriminant Analysis and Structural Equation Modeling Both Identify Effects in Random Data

May 16, 2019 ~ paulyarnold

Ariel Linden, Fred B. Bryant & Paul R. Yarnold
Linden Consulting Group, LLC, Loyola University Chicago & Optimal Data Analysis, LLC

Recent research compared the ability of various classification algorithms [logistic regression (LR), random forests (RF), support vector machines (SVM), boosted regression (BR), multi-layer perceptron neural net model (MLP), and classification tree analysis (CTA)] to …

Multi-Layer Perceptron Neural Net Model Identifies Effect in Random Data

April 22, 2019 ~ paulyarnold

Ariel Linden & Paul R. Yarnold
Linden Consulting Group, LLC & Optimal Data Analysis, LLC

Prior research contrasted the ability of different classification algorithms [logistic regression (LR), random forests (RF), boosted regression (BR), support vector machines (SVM), classification tree analysis (CTA)] to correctly fail to identify a relationship between a binary class (dependent) variable and …

Optimizing Suboptimal Classification Trees: Matlab® CART Model Predicting Probability of Lower Limb Prosthesis User’s Functional Potential

April 14, 2019 ~ paulyarnold

Paul R. Yarnold & Ariel Linden
Optimal Data Analysis, LLC & Linden Consulting Group, LLC

After any algorithm which controls the growth of a classification tree model has completed, the resulting model must be pruned in order to explicitly maximize predictive accuracy normed against chance. This article illustrates manually-conducted maximum-accuracy pruning of a classification and …

Regression vs. Novometric Analysis Predicting Income Based on Education

April 14, 2019 ~ paulyarnold

Paul R. Yarnold
Optimal Data Analysis, LLC

This study compares linear regression vs. novometric models of the association of education and income for a sample of 32 observations. Regression analysis identified a relatively strong effect (R-squared=56.4%), but only 25% of point predictions fell within a 20% band of actual income. Novometric analysis identified a strong …

Effect of Sample Size on Discovery of Relationships in Random Data by Classification Algorithms

April 11, 2019 ~ paulyarnold

Ariel Linden & Paul R. Yarnold
Linden Consulting Group, LLC & Optimal Data Analysis, LLC

In a recent paper, we assessed the ability of several classification algorithms (logistic regression, random forests, boosted regression, support vector machines, and classification tree analysis [CTA]) to correctly not identify a relationship between the dependent variable and ten covariates generated …

ODA vs. χ², r, and τ: Trauma Exposure in Childhood and Duration of Participation in Eating-Disorder Treatment Program

April 11, 2019 ~ paulyarnold

Paul R. Yarnold
Optimal Data Analysis, LLC

This note illustrates the disorder and confusion attributable to analytic ethos whereby a smorgasbord of different statistical tests are used to test identical or parallel statistical hypotheses. Herein four classic methods are used for an application with a binary class (dependent) variable and an ordered attribute (independent variable) …

Novometric Stepwise CTA Analysis Discriminating Three Class Categories Using Two Ordered Attributes

April 6, 2019 ~ paulyarnold

Paul R. Yarnold & Ariel Linden
Optimal Data Analysis, LLC & Linden Consulting Group, LLC

The adaptability of novometric analysis is illustrated for an example involving three class categories and two ordered attributes.

Some Machine Learning Algorithms Find Relationships Between Variables When None Exist — CTA Doesn’t

March 21, 2019 ~ paulyarnold

Ariel Linden & Paul R. Yarnold
Linden Consulting Group, LLC & Optimal Data Analysis, LLC

Automated machine learning algorithms are widely promoted as the best approach for estimating propensity scores, because these methods detect patterns in the data which manual efforts fail to identify. If classification algorithms are indeed ideal for identifying relationships between treatment …

Optimal Markov Model Relating Two Time-Lagged Outcomes

February 18, 2019 ~ paulyarnold

Paul R. Yarnold
Optimal Data Analysis, LLC

This paper demonstrates the use of maximum-accuracy weighted Markov analysis to model the relationship between two time-lagged variables (serial ratings of pain during the day and subsequent quality of sleep at night) for an individual.

Weighted Optimal Markov Model of a Single Outcome: Ipsative Standardization of Ordinal Ratings is Unnecessary

February 13, 2019 ~ paulyarnold

Paul R. Yarnold
Optimal Data Analysis, LLC

This note empirically compares the use of raw vs. ipsatively standardized variables in optimal weighted Markov analysis involving a series for a single outcome: presently, ratings of sleep difficulties for an individual. Findings indicate that the raw score and ipsatively standardized ordinal ratings yield equivalent results in such designs. …

More On: “Optimizing Suboptimal Classification Trees: S-PLUS® Propensity Score Model for Adjusted Comparison of Hospitalized vs. Ambulatory Patients with Community-Acquired Pneumonia”

February 12, 2019 ~ paulyarnold

Paul R. Yarnold
Optimal Data Analysis, LLC

A recent article optimized ESS of a suboptimal classification tree model that discriminated hospitalized vs. ambulatory patients with community acquired pneumonia (CAP). This note suggests possible alternatives for two original attributes as a means of increasing model accuracy: patient disease-specific knowledge vs. “college education”, and patient-specific functional status …

Confirming the Efficacy of Weighting in Optimal Markov Analysis: Modeling Serial Symptom Ratings

February 7, 2019 / February 16, 2019 ~ paulyarnold

Paul R. Yarnold & Robert C. Soltysik
Optimal Data Analysis, LLC

This paper uses ODA to weight each event in the transition table by its corresponding absolute change-in-value, thereby maximizing precision of the class variable as well as model accuracy.

Multiple Regression vs. Novometric Analysis of a Contingency Table

February 4, 2019 ~ paulyarnold

Paul R. Yarnold
Optimal Data Analysis, LLC

This paper compares findings obtained using multiple regression analysis vs. novometric analysis to identify the relationship between the type of degree earned by 8th Grade math teachers and the training which they received in cultural and cognitive student diversity methods.

Optimizing Suboptimal Classification Trees: S-PLUS® Propensity Score Model for Adjusted Comparison of Hospitalized vs. Ambulatory Patients with Community-Acquired Pneumonia

January 31, 2019 / February 4, 2019 ~ paulyarnold

Paul R. Yarnold
Optimal Data Analysis, LLC

Pruning to maximize model accuracy (requiring simple hand computation) is applied to a classification tree model developed via S-PLUS to create propensity scores to improve causal inference in comparing hospitalized vs. ambulatory patients with community-acquired pneumonia. Research reported herein constitutes a thought-provoking example of a striking misalliance between …