Date of Award
2017
Embargo Period
8-1-2024
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Public Health Sciences
College
College of Graduate Studies
First Advisor
Valerie Durkalski
Second Advisor
Bethany Wolf
Third Advisor
Dongjun Chung
Fourth Advisor
Constantine Karvellas
Fifth Advisor
David Koch
Abstract
Clustered binary outcomes are frequently encountered in medical research (e.g. longitudinal studies). Generalized linear mixed models (GLMMs) typically employed for clustered endpoints have challenges for some scenarios (e.g. high dimensional data). In the first dissertation aim, we develop an alternative, data-driven method called Binary Mixed Model (BiMM) tree, which combines decision tree and GLMM. We propose a procedure akin to the expectation maximization algorithm, which iterates between developing a classification and regression tree using all predictors and developing a GLMM which includes indicator variables for terminal nodes from the tree as predictors along with a random effect for the clustering variable. Since prediction accuracy may be increased through ensemble methods, we extend BiMM tree methodology within the random forest setting in the second dissertation aim. BiMM forest combines random forest and GLMM within a unified framework using an algorithmic procedure which iterates between developing a random forest and using the predicted probabilities of observations from the random forest within a GLMM that contains a random effect for the clustering variable. Simulation studies show that BiMM tree and BiMM forest methodology offer similar or superior prediction accuracy compared to standard classification and regression tree, random forest and GLMM for clustered binary outcomes. The new BiMM methods are used to develop prediction models within the acute liver failure setting using the first seven days of hospital data for the third dissertation aim. Acute liver failure is a rare and devastating condition characterized by rapid onset of severe liver damage. The majority of prediction models developed for acute liver failure patients use admission data only, even though many clinical and laboratory variables are collected daily. The novel BiMM tree and forest methodology developed in this dissertation can be used in diverse research settings to provide highly accurate and efficient prediction models for clustered and longitudinal binary outcomes.
Recommended Citation
Speiser, Jaime Lynn, "Decision Tree and Random Forest Methodology for Clustered and Longitudinal Binary Outcomes" (2017). MUSC Theses and Dissertations. 381.
https://medica-musc.researchcommons.org/theses/381
Rights
All rights reserved. Copyright is held by the author.