Statistical Evaluation of the Findings of Qualitative Comparative Analysis

Paul R. Yarnold

Optimal Data Analysis, LLC

Qualitative data analysis is a structured observational and clustering methodology that facilitates hypothesis development and variable generation for quantitative research, and it has been fruitfully employed in agriculture, anthropology, astronomy, biology, forensic investigation, education, history, marketing, medicine, political science, psychology, sociology, and zoology, to name a handful of disciplines. The method known as Qualitative Comparative Analysis (QCA) is adept at producing evidence on complex policy problems involving interdependencies among multiple causes. Recent research used QCA to study factors underlying high rates of teenage conceptions in high-risk areas in England. Nine binary attributes reflecting five different “variable constellations” were identified. Variable constellations are putatively associated with areas whose teenage conception rates are narrowing versus not narrowing (the outcome or class variable) with respect to the national average. This article discusses the use of UniODA and CTA to ascertain which attributes are statistically reliable predictors of outcome.

View journal article

Exploratory Analysis for an Ordered Series of a Dichotomous Attribute: Airborne Radiation and Congenital Hypothyroidism of California Newborns

Paul R. Yarnold & Robert C. Soltysik

Optimal Data Analysis, LLC

Confirmatory hypothesis-testing methodology was recently demonstrated with an example assessing the effect of airborne beta nuclear radiation emanating from the Fukushima nuclear meltdown on the risk of confirmed congenital hypothyroidism (CH) for newborns in California in the years 2011-2012. Eyeball inspection of the data suggests that the a priori hypothesis which was evaluated is inconsistent with the actual data, so an exploratory analysis is conducted.

View journal article

Confirmatory Analysis for an Ordered Series of a Dichotomous Attribute: Airborne Radiation and Congenital Hypothyroidism of California Newborns

Paul R. Yarnold & Robert C. Soltysik

Optimal Data Analysis, LLC

Ordered series involving a dichotomous (binary) variable are widely used to describe changes in phenomena that occur across time. Examples of such series include the percentage of a sample or population each year (or other unit of time) that marries, dies, or is arrested. This article demonstrates how UniODA is used for such designs to test a confirmatory (a priori), omnibus (overall), optimal (maximum-accuracy) hypothesis, which is subsequently disentangled by a confirmatory optimal range test if hypotheses are composed, or otherwise by an exploratory (post hoc) optimal range test. This methodology is demonstrated with an example assessing the effect of airborne beta nuclear radiation emanating from the Fukushima nuclear meltdown on the risk of congenital hypothyroidism (CH) for newborns in California in the years 2011-2012.

View journal article

MegaODA Large Sample and BIG DATA Time Trials: Maximum Velocity Analysis

Paul R. Yarnold & Robert C. Soltysik

Optimal Data Analysis, LLC

This third time trial of newly released MegaODA™ software studies the fastest-to-analyze application, the 2×2 cross-classification table. Designs involving unweighted binary data are arguably the most widely employed at present across quantitative scientific disciplines, as well as in engineering fields including communications, graphics, data compression, real-time processing, and autonomous synthetic decision-making, among others. The present simulation research is run on a 3 GHz Intel Pentium D microcomputer and reveals that MegaODA returns the exact one- or two-tailed Type I error rate, as well as all of the other classification-relevant statistics provided in UniODA analysis, in fractions of a CPU second for samples of a million observations.
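As a rough illustration of what an exact one- or two-tailed Type I error rate means for a 2×2 cross-classification table, the sketch below computes Fisher's exact test from the hypergeometric distribution. It is a stand-in chosen for illustration, not MegaODA's or UniODA's actual algorithm, and the function names (`log_comb`, `hypergeom_pmf`, `fisher_exact_2x2`) and the example cell counts are hypothetical.

```python
from math import lgamma, exp


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k), via log-gamma."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def hypergeom_pmf(a, row1, col1, n):
    """Probability that the top-left cell equals a, given fixed margins."""
    return exp(log_comb(col1, a) + log_comb(n - col1, row1 - a) - log_comb(n, row1))


def fisher_exact_2x2(a, b, c, d):
    """Return (one_tailed, two_tailed) exact p-values for the table [[a, b], [c, d]].

    Assumption: the one-tailed test looks for a top-left count at least as
    large as observed; the two-tailed p sums every table whose probability
    does not exceed that of the observed table.
    """
    n = a + b + c + d
    row1, col1 = a + b, a + c
    lowest = max(0, row1 + col1 - n)   # smallest feasible top-left count
    highest = min(row1, col1)          # largest feasible top-left count
    cells = range(lowest, highest + 1)
    probs = [hypergeom_pmf(x, row1, col1, n) for x in cells]
    p_obs = hypergeom_pmf(a, row1, col1, n)
    one_tailed = sum(p for x, p in zip(cells, probs) if x >= a)
    # Small relative tolerance guards against floating-point ties.
    two_tailed = sum(p for p in probs if p <= p_obs * (1 + 1e-7))
    return min(one_tailed, 1.0), min(two_tailed, 1.0)


# Hypothetical 2x2 table, used only to exercise the function.
one_p, two_p = fisher_exact_2x2(3, 1, 1, 3)
print(one_p, two_p)
```

Working with log-gamma rather than raw factorials keeps the arithmetic stable when cell counts are large, although this plain-Python loop makes no claim to match the timings reported in the abstract.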

View journal article

Determining When Annual Crude Mortality Rate Most Recently Began Increasing in North Dakota Counties, I: Backward-Stepping Little Jiffy

Paul R. Yarnold

Optimal Data Analysis, LLC

Recent research tested the hypothesis that the annual crude mortality rate (ACMR) was higher after versus before 1998 in counties of North Dakota, due to increased exposure of the population to environmental toxins and hazards beginning at approximately that time. This hypothesis was confirmed with experimentwise p < 0.05 for 16 counties. This article investigates the ACMR time series for each of these counties using a backward-stepping little jiffy UniODA analysis to ascertain precisely when ACMR began to increase. As hypothesized, the ACMR began increasing in Bowman and Kidder counties precisely in 1998. Consistent with the a priori hypothesis, the initial (and presently sustained) increase in ACMR occurred in McLean County in 1997 and in Foster County in 1996. Significant sustained increases in ACMR began in Stark County in 1993 and in Burleigh County in 1988. The UniODA models identified hypothesized, recent, powerful, sustained, statistically significant increases in ACMR.

View journal article

Surfing the Index of Consumer Sentiment: Identifying Statistically Significant Monthly and Yearly Changes

Paul R. Yarnold

Optimal Data Analysis, LLC

Published monthly by the Survey Research Center of the University of Michigan, the Index of Consumer Sentiment (ICS) is widely followed, and one of its factors (the Index of Consumer Expectations) is used in the Leading Indicator Composite Index published by the US Department of Commerce, Bureau of Economic Analysis. Using household telephone interviews, the ICS provides an empirical measure of near-term consumer attitudes on business climate, personal finance, and spending. Variation in ICS influences price and volume in currency, bond, and equity markets in the US and in markets globally. The practice of releasing monthly ICS values five minutes to two seconds earlier for elite customers via high-speed communication channels was recently suspended because it provided unfair trading advantages. This article investigates the trajectory of the ICS over the most recent three years, evaluating the statistical significance of month-over-month and year-over-year changes. These analyses define a longitudinal series of class variables that may be modeled temporally using time-lagged single- (UniODA) and multiple- (CTA) attribute ODA methods.
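The month-over-month and year-over-year comparisons mentioned above can be sketched as lagged differences of a monthly series, with each change then coded as a binary class value. This is a minimal sketch only: the index values are made up for illustration (they are not actual ICS data), the helper names `lagged_changes` and `direction_classes` are hypothetical, and the increase-versus-not coding is an assumption. The statistical test of each change is not reproduced here.

```python
def lagged_changes(values, lag):
    """Return values[i] - values[i - lag] for every index that has a lagged partner."""
    return [values[i] - values[i - lag] for i in range(lag, len(values))]


def direction_classes(values, lag):
    """Code each lagged change as 1 (increase) or 0 (no increase)."""
    return [1 if change > 0 else 0 for change in lagged_changes(values, lag)]


# Fabricated monthly index values for illustration only (14 consecutive months).
index_values = [70.0, 72.5, 71.0, 74.0, 73.5, 76.0, 75.0,
                77.5, 78.0, 76.5, 79.0, 80.0, 82.0, 81.5]

month_over_month = lagged_changes(index_values, lag=1)   # 13 changes
year_over_year = lagged_changes(index_values, lag=12)    # 2 changes
mom_classes = direction_classes(index_values, lag=1)

print(month_over_month)
print(year_over_year)
print(mom_classes)
```

Coding the changes as a binary series is one simple way to obtain a class variable from the index; other codings (for example, thresholds on the size of the change) would fit the same helpers with a different comparison.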

View journal article