Distance from a Theoretically Ideal Statistical Classification Model Defined as the Number of Additional Equivalent Effects Needed to Obtain Perfect Classification for the Sample

Paul R. Yarnold

Optimal Data Analysis, LLC

A method is described for computing the distance between an empirically derived statistical classification model and its corresponding theoretically ideal classification model. Use of the distance index to identify and compare globally optimal classification models, within and between descendant families, is illustrated with an example that uses ethnicity to parse the incidence of different types of cancer.

