USF Tampa Graduate Theses and Dissertations

A Study of Machine Learning Performance in the Prediction of Juvenile Diabetes from Clinical Test Results

Shibendra Pobi, University of South Florida

Graduation Year

2006

Document Type

Thesis

Degree

M.A.

Degree Granting Department

Computer Science

Major Professor

Lawrence O. Hall, Ph.D.

Committee Member

Dmitry Goldgof, Ph.D.

Committee Member

Rangachar Kasturi, Ph.D.

Keywords

Ensembles, Decision trees, Neural networks, Diabetes prediction, Over-sampling

Abstract

Two approaches to building models for predicting the onset of Type 1 diabetes mellitus in juvenile subjects were examined. In the first, a set of tests performed immediately before diagnosis was used to build classifiers to predict whether the subject would be diagnosed with juvenile diabetes. In the second, a modified training set consisting of differences between test results taken at different times was used to build classifiers for the same prediction. Neural networks were compared with decision trees and with ensembles of both types of classifiers. Support Vector Machines were also tested on this dataset. In both cases, the highest known predictive accuracy was obtained when the data was encoded to explicitly indicate missing attributes. In the latter case, high accuracy was achieved without test results which, by themselves, could indicate diabetes. The effects of oversampling minority class samples in the training set by generating synthetic examples were tested with ensemble techniques such as bagging and random forests. Oversampling of diabetic examples led to increased accuracy in diabetic prediction, demonstrated by a significantly better F-measure value. ROC curves and the statistical F-measure were used to compare the performance of the different machine learning algorithms.
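
As an illustration only, three ideas named in the abstract (explicit encoding of missing attributes, synthetic oversampling of the minority class, and the F-measure) might be sketched in Python as follows. The abstract does not give the thesis's exact procedures, so every function name, the fill value, and the random-pair interpolation scheme below are assumptions; in particular, published synthetic oversampling methods usually interpolate toward nearest neighbors rather than toward randomly chosen minority samples.

```python
import random


def encode_missing(row, fill=0.0):
    """Replace missing values (None) with `fill` and append one 0/1 flag
    per attribute marking where a value was missing.
    Hypothetical encoding; the thesis's actual scheme is not stated."""
    values = [fill if v is None else v for v in row]
    flags = [1 if v is None else 0 for v in row]
    return values + flags


def oversample_minority(samples, n_new, seed=0):
    """Create `n_new` synthetic minority examples by linear interpolation
    between two randomly chosen minority samples (a simplification:
    no nearest-neighbor search is performed)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(samples, 2)
        gap = rng.random()  # position along the segment from a to b
        synthetic.append([x + gap * (y - x) for x, y in zip(a, b)])
    return synthetic


def f_measure(y_true, y_pred, positive=1):
    """Balanced F-measure: harmonic mean of precision and recall
    for the class labeled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, synthetic examples would be added to the training set only, and the F-measure would be computed on held-out data with the diabetic class as `positive`, so that a gain in minority-class prediction is visible even when overall accuracy changes little.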

Scholar Commons Citation

Pobi, Shibendra, "A Study of Machine Learning Performance in the Prediction of Juvenile Diabetes from Clinical Test Results" (2006). USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/2661

Included in

American Studies Commons
