Graduation Year

2023

Document Type

Dissertation

Degree

Ph.D.

Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Public Health

Major Professor

Janice Zgibor, RPh, Ph.D.

Co-Major Professor

Jason Beckstead, Ph.D.

Committee Member

Wei Wang, Ph.D.

Committee Member

Henian Chen, MD, Ph.D.

Keywords

Clustering Imputation, K-Means, Latent Class Imputation, Missing Data, Multiple Imputation, K-Modes, Non-Parametric Imputation

Abstract

Research involves a variety of difficulties, especially when it involves human subjects, and one of the most prevalent is missing data. Missing data will always be present in research because no method of collecting data fully protects against human error or mechanical failure. Researchers must therefore be able to mitigate the problems that accompany missing data: reduced power of an analysis and bias introduced by the missingness pattern. This research investigated a non-parametric method that uses a nested approach of fuzzy K-Modes and fuzzy C-Means clustering to impute missing data, in an effort to reduce the issues introduced by the most severe type of missing data, missing not at random. This method was compared to complete case analysis and Latent Class Imputation. The results of the simulation showed that the proposed method did not sufficiently remove the bias imposed by the missing not at random pattern and could not successfully detect statistically significant regression coefficients. The method performed better with continuous variables and proved to be a more efficient estimation method than Latent Class Imputation, with lower standard errors of the estimates in every scenario. When its priors were not correctly specified, Latent Class Imputation also failed to sufficiently mitigate the issues of the missing not at random pattern. Because the proposed method is more efficient than Latent Class Imputation and requires no outside assistance beyond refinement of the algorithm, it holds promise that, with further tuning, it will successfully mitigate the problems of data that are missing not at random.

Scholar Commons Citation

Malmi, Markku A. Jr., "Fuzzy KC Clustering Imputation for Missing Not At Random Data" (2023). USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/9901

Included in

Biostatistics Commons

USF Tampa Graduate Theses and Dissertations

Fuzzy KC Clustering Imputation for Missing Not At Random Data
