Graduation Year
2023
Document Type
Dissertation
Degree
Ph.D.
Degree Name
Doctor of Philosophy (Ph.D.)
Degree Granting Department
Public Health
Major Professor
Janice Zgibor, RPh, Ph.D.
Co-Major Professor
Jason Beckstead, Ph.D.
Committee Member
Wei Wang, Ph.D.
Committee Member
Henian Chen, MD, Ph.D.
Keywords
Clustering Imputation, K-Means, Latent Class Imputation, Missing Data, Multiple Imputation, K-Modes, Non-Parametric Imputation
Abstract
Research, especially research involving human subjects, faces a variety of difficulties, and one of the most prevalent is missing data. Missing data will always be present in research because no method of collecting data fully protects against human error or mechanical failure. Researchers must therefore be able to mitigate the problems that accompany missing data: reduced statistical power and bias introduced by the missingness pattern. This research investigated a non-parametric method that uses a nested approach of fuzzy K-Modes and fuzzy C-Means clustering to impute missing data, in an effort to reduce the issues introduced by the most severe type of missing data, missing not at random. This method was compared with complete case analysis and Latent Class Imputation. The simulation results showed that the proposed method did not sufficiently remove the bias imposed by the missing not at random pattern and could not successfully detect statistically significant regression coefficients. The method performed better with continuous variables and proved to be a more efficient estimation method than Latent Class Imputation, with lower standard errors of the estimates in every scenario. Latent Class Imputation with incorrectly specified priors also failed to sufficiently mitigate the issues of the missing not at random pattern. Because the proposed method is more efficient than Latent Class Imputation and requires no outside assistance beyond refinement of the algorithm, it holds promise that, with further tuning, it will successfully mitigate the problems of data that are missing not at random.
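For readers unfamiliar with clustering-based imputation, the following is a minimal illustrative sketch of the general idea for continuous variables only. It is not the dissertation's algorithm: the nested fuzzy K-Modes step for categorical variables is omitted, and the function name `fuzzy_cmeans_impute`, its parameters, the column-mean starting values, and the choice to impute with a membership-weighted average of cluster centers are all assumptions made for illustration. The toy data simulate a missing not at random pattern by deleting large values of one variable.

```python
import numpy as np


def fuzzy_cmeans_impute(X, n_clusters=2, m=2.0, n_iter=50, seed=0):
    """Hypothetical helper: impute NaNs using fuzzy C-Means memberships.

    m is the fuzzifier (m > 1); larger values give softer memberships.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Assumption: start from column-mean imputation, then refine iteratively.
    filled = np.where(missing, np.nanmean(X, axis=0), X)

    # Assumption: initialise centers from randomly chosen rows.
    rng = np.random.default_rng(seed)
    rows = rng.choice(len(filled), size=n_clusters, replace=False)
    centers = filled[rows].copy()

    for _ in range(n_iter):
        # Distance from every row to every center (small offset avoids 0 division).
        dist = np.linalg.norm(filled[:, None, :] - centers[None, :, :], axis=2) + 1e-12

        # Standard fuzzy C-Means membership update; each row sums to 1.
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)

        # Standard fuzzy C-Means center update, weighted by u ** m.
        w = u ** m
        centers = (w.T @ filled) / w.sum(axis=0)[:, None]

        # Overwrite only the missing cells with a membership-weighted
        # average of the cluster centers; observed cells are never changed.
        filled[missing] = (u @ centers)[missing]

    return filled, u


# Toy data: two separated groups of continuous observations.
rng = np.random.default_rng(1)
low = rng.normal(0.0, 1.0, size=(50, 2))
high = rng.normal(8.0, 1.0, size=(50, 2))
X_true = np.vstack([low, high])

# Missing not at random: large values of column 1 go unrecorded,
# so missingness depends on the unobserved value itself.
X_obs = X_true.copy()
X_obs[X_true[:, 1] > 8.5, 1] = np.nan

X_imp, memberships = fuzzy_cmeans_impute(X_obs)
```

Comparing `X_imp` with `X_true` on the deleted cells gives a rough sense of how much bias remains after imputation, which is the kind of question the simulation described in the abstract addresses on a much larger scale.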
Scholar Commons Citation
Malmi, Markku A. Jr., "Fuzzy KC Clustering Imputation for Missing Not At Random Data" (2023). USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/9901