Graduation Year
2019
Document Type
Thesis
Degree
M.A.
Degree Name
Master of Arts (M.A.)
Degree Granting Department
Mathematics and Statistics
Major Professor
Lu Lu, Ph.D.
Committee Member
Mingyang Li, Ph.D.
Committee Member
Seung-Yeop Lee, Ph.D.
Keywords
Bootstrapping, Classification, Decision Trees, Machine Learning, Statistics
Abstract
Ensemble methods are commonly used for building predictive models for classification. Models that are unstable to perturbations in the training set, such as the decision tree, often see considerable reductions in error when grouped, using bootstrapped resamples of the training data to train many models. The non-parametric bootstrap, however, has limited efficacy when used on severely imbalanced data, especially when the number of observations of one or more classes is exceptionally small. We explore the fractional random weighted bootstrap, which randomly assigns fractional weights to observations, as an alternative resampling procedure in training machine learning ensembles, particularly decision tree ensembles. We carry out a methodological study comparing the standard bagging and random forest ensemble models for decision trees against their fractionally random weighted alternatives, finding some evidence supporting their use on data with severe imbalance.
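The contrast between the two resampling schemes named in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis. It assumes the fractional weights follow a uniform Dirichlet distribution scaled to sum to n, which is one common formulation of the fractional random weighted bootstrap. The function names, the 3-versus-97 class split, and the replicate count are all hypothetical choices made for illustration.

```python
import random


def fractional_random_weights(n, rng):
    """Draw n non-negative fractional weights that sum to n.

    Assumption: weights are uniform-Dirichlet draws scaled by n, obtained
    here by normalizing independent Exponential(1) variates.
    """
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [n * d / total for d in draws]


def bootstrap_counts(n, rng):
    """Standard non-parametric bootstrap written as integer weights:
    the number of times each observation is drawn in n draws with replacement."""
    counts = [0] * n
    for _ in range(n):
        counts[rng.randrange(n)] += 1
    return counts


rng = random.Random(0)

# Hypothetical, severely imbalanced labels: 3 minority cases out of 100.
labels = [1] * 3 + [0] * 97
n = len(labels)
n_replicates = 1000

dropped_bootstrap = 0   # replicates in which no minority case is resampled
dropped_fractional = 0  # replicates in which the minority class has zero total weight

for _ in range(n_replicates):
    counts = bootstrap_counts(n, rng)
    weights = fractional_random_weights(n, rng)
    if sum(c for c, y in zip(counts, labels) if y == 1) == 0:
        dropped_bootstrap += 1
    if sum(w for w, y in zip(weights, labels) if y == 1) == 0:
        dropped_fractional += 1

print("replicates without minority cases (standard bootstrap):", dropped_bootstrap)
print("replicates without minority cases (fractional weights):", dropped_fractional)
```

In this sketch the standard bootstrap occasionally produces a replicate containing no minority observations, while the fractional weights keep every observation in every replicate with some positive weight. To build an ensemble along these lines, each weight vector could be handed to a tree learner that accepts per-observation weights, for example through the `sample_weight` argument of scikit-learn's `DecisionTreeClassifier.fit`. Whether the thesis uses that library is not stated in this record.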
Scholar Commons Citation
Carter, Sean Charles, "Fractional Random Weighted Bootstrapping for Classification on Imbalanced Data with Ensemble Decision Tree Methods" (2019). USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/8012