Graphic-Centric, Computationally-Efficient Recursive Partitioning
James Vivian, (Golden Helix, Inc., Bozeman, MT), firstname.lastname@example.org,
S. Stan Young, (CGStat LLC, Raleigh, NC), email@example.com, and
Christophe Lambert, (Golden Helix, Inc., Bozeman, MT), firstname.lastname@example.org
FIRM (Formal Inference-based Recursive Modeling) is a hypothesis-testing-based, multi-way splitting version of recursive partitioning that offers significant advantages in computational speed and interpretability. We present a GUI implementation of FIRM that remains interactive for modestly large data sets (10k observations and 1k descriptors) and provides decision-assisting graphics. There are two applications: one customized for the analysis of high-throughput screening data in drug discovery, and the other for pharmacogenetic analysis, i.e., linking patient drug response to genotype/phenotype information. The visual nature of recursive partitioning 'trees' and the usual availability of alternative split variables both point to the need for interactive decision-making during tree building. The chemistry software builds chemically intuitive descriptors and marks important split features in molecules. The pharmacogenetics application builds in complex statistical genetics methods: linkage disequilibrium analysis, visualization of, e.g., Hardy-Weinberg equilibrium, and a tractable (and ruthlessly efficient) method to impute haplotypes. Using examples from public datasets, we will demonstrate the power, efficiency, and intuitive appeal of GUI-based recursive partitioning for unraveling complex patterns and relationships in large datasets, transforming data into information. We will emphasize the integration of graphics into a user-centric recursive partitioning analysis.
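To make the idea of hypothesis-testing-based, multi-way splitting with alternative split variables concrete, the following is a minimal sketch and not the FIRM implementation. It assumes categorical descriptors and a categorical response, scores each descriptor with a chi-square test of independence, applies a Bonferroni correction across descriptors (an assumption of this sketch), and returns all candidates ranked by adjusted p-value so that a user could choose among near-equivalent splits. The function names `chi2_sf`, `split_p_value`, and `rank_splits` and the toy data are hypothetical; the merging of descriptor levels that a full multi-way procedure would perform is omitted.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1.

    Uses the standard closed forms for even and odd degrees of freedom.
    """
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2))
    #         + sqrt(2x/pi) * exp(-x/2) * sum_{i=1}^{(df-1)/2} x^(i-1) / (1*3*...*(2i-1))
    term, total = 1.0, 0.0
    for i in range(1, (df - 1) // 2 + 1):
        if i > 1:
            term *= x / (2 * i - 1)
        total += term
    tail = math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total
    return math.erfc(math.sqrt(x / 2)) + tail


def split_p_value(levels, response):
    """Chi-square test of independence between one descriptor and the response."""
    n = len(response)
    cell = Counter(zip(levels, response))
    row = Counter(levels)
    col = Counter(response)
    df = (len(row) - 1) * (len(col) - 1)
    if df == 0:
        return 1.0  # a constant descriptor or response cannot support a split
    stat = 0.0
    for r in row:
        for c in col:
            expected = row[r] * col[c] / n
            stat += (cell[(r, c)] - expected) ** 2 / expected
    return chi2_sf(stat, df)


def rank_splits(predictors, response):
    """Return (adjusted p-value, descriptor name, number of branches), best first."""
    scored = []
    for name, levels in predictors.items():
        p = split_p_value(levels, response)
        adjusted = min(1.0, p * len(predictors))  # Bonferroni (sketch assumption)
        scored.append((adjusted, name, len(set(levels))))
    return sorted(scored)


# Hypothetical toy data: 12 compounds, active (1) or inactive (0).
predictors = {
    "scaffold": ["A"] * 4 + ["B"] * 4 + ["C"] * 4,
    "noise": ["x", "y"] * 6,
}
response = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0]

for adjusted, name, branches in rank_splits(predictors, response):
    print(f"{name}: {branches}-way split, adjusted p = {adjusted:.4f}")
```

Returning the full ranked list, rather than only the best split, reflects the point that alternative split variables are usually available and that the choice among them can be left to the analyst.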