Graduation Year
2019
Document Type
Dissertation
Degree
Ph.D.
Degree Name
Doctor of Philosophy (Ph.D.)
Degree Granting Department
Mathematics and Statistics
Major Professor
Chris P. Tsokos, Ph.D.
Committee Member
Khandethody M. Ramachandran, Ph.D.
Committee Member
Lu Lu, Ph.D.
Committee Member
Getachew A. Dagne, Ph.D.
Keywords
CPI, DIS, F8, Hemophilia, K-means, Inhibitor, Severity, Diabetes, Prediabetes, Forest, SVM
Abstract
Parametric analysis of real-world data is one of the most powerful tools for characterizing probabilistic behavior in social, economic, medical, epidemiological, and other areas of study. In the present study, we identify the theoretical probability distribution function (PDF) of the Democracy Index Scores (DIS) from the Economist Intelligence Unit (EIU) database and obtain maximum likelihood estimates of its parameters. We also identify individual PDFs for each of the clusters defined by the EIU: Full Democracy, Flawed Democracy, Hybrid Regime, and Authoritarian Regime.
A statistical model is a convenient instrument for predicting future values of a real phenomenon. In addition to identifying probability distributions, we predict the DIS for 167 countries of the world through a regression model with a high degree of accuracy. We then perform cluster analysis with the K-means clustering algorithm, based on the DIS predicted by the statistical model we have developed.
By extracting the Corruption Perception Index (CPI) and the World Governance Index (WGI) from the Transparency International (TI) and World Bank (WB) databases, respectively, we estimate a theoretical PDF of the CPI for 175 countries of the world. Moreover, we estimate individual PDFs for each of the clusters: Highly Corrupted, Moderately Corrupted, Fairly Corrupted, and Least Corrupted countries.
We conduct statistical analyses of Hemophilia A, based on data retrieved from the Centers for Disease Control and Prevention (CDC) CHAMP F8 surveillance program, to identify the risk factors associated with the severity level of Hemophilia A. We identify a statistical model that predicts the probability of the severity level of Hemophilia A.
Finally, we study several standard machine learning algorithms to identify the one that best classifies and predicts the prediabetes state of an individual. For this study, the data were extracted from the National Health and Nutrition Examination Survey (NHANES), part of the Centers for Disease Control and Prevention (CDC). We compare the identified champion algorithm with machine learning algorithms suggested by researchers in other countries.
Scholar Commons Citation
Bashar, A. K. M. Raquibul, "Probabilistic Modeling of Democracy, Corruption, Hemophilia A and Prediabetes Data" (2019). USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/8007