Graduation Year
2016
Document Type
Thesis
Degree
M.S.C.S.
Degree Name
MS in Computer Science (M.S.C.S.)
Degree Granting Department
Computer Science and Engineering
Major Professor
Lawrence O. Hall, Ph.D.
Committee Member
Dmitry Goldgof, Ph.D.
Committee Member
Rangachar Kasturi, Ph.D.
Keywords
Evidential Clustering, Single Pass Fuzzy C-Means
Abstract
Clustering large data sets has become increasingly important as the amount of available unlabeled data grows. Single Pass Fuzzy C-Means (SPFCM) is useful when memory is too limited to load the whole data set. The main idea is to divide the data set into several chunks and to apply fuzzy c-means (FCM) to each chunk. SPFCM carries the weighted cluster centers of the previous chunk into the clustering of the next chunk. However, as the number of chunks increases, the algorithm becomes sensitive to the order in which the data is processed. Hence, we improved SPFCM by recognizing boundary and noisy data in each chunk and using them to influence clustering in the next chunks. The proposed approach transfers the boundary and noisy data, as well as the weighted cluster centers, to the next chunks. We show that the proposed approach is significantly less sensitive to the order in which the data is loaded in each chunk.
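The chunk-by-chunk idea described in the abstract can be illustrated with a minimal sketch of baseline SPFCM. This is not the thesis implementation and does not include the proposed transfer of boundary and noisy data. The function names (memberships, weighted_fcm, single_pass_fcm), the fuzzifier m = 2, the Euclidean distance, the random initialization, and the weight update (each center's weight is the weighted sum of memberships assigned to it) are all assumptions made for illustration.

```python
import numpy as np

def memberships(X, centers, m):
    """Fuzzy memberships of each row of X in each center (rows sum to 1)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)          # avoid division by zero
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def weighted_fcm(X, w, c, m=2.0, n_iter=300, tol=1e-6, init=None, seed=0):
    """Weighted fuzzy c-means: X is (n, d), w holds one weight per point."""
    rng = np.random.default_rng(seed)
    if init is None:
        # Assumption: start from c distinct points chosen at random.
        centers = X[rng.choice(len(X), size=c, replace=False)].astype(float)
    else:
        centers = np.array(init, dtype=float)
    for _ in range(n_iter):
        U = memberships(X, centers, m)
        wu = w[:, None] * U ** m        # weighted fuzzified memberships
        new_centers = (wu.T @ X) / wu.sum(axis=0)[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(X, centers, m)

def single_pass_fcm(X, c, n_chunks, m=2.0, seed=0):
    """Cluster X one chunk at a time, carrying weighted centers forward."""
    centers, weights = None, None
    for chunk in np.array_split(X, n_chunks):
        if centers is None:
            data, w = chunk, np.ones(len(chunk))
        else:
            # Previous centers join the new chunk as weighted points.
            data = np.vstack([centers, chunk])
            w = np.concatenate([weights, np.ones(len(chunk))])
        centers, U = weighted_fcm(data, w, c, m=m, init=centers, seed=seed)
        # Each center summarizes the weight of the points assigned to it.
        weights = (w[:, None] * U).sum(axis=0)
    return centers, weights
```

Because only the c weighted centers are carried between chunks, memory use depends on the chunk size rather than on the full data set. It also means the result can depend on which points land in which chunk, which is the order sensitivity the abstract refers to.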
Scholar Commons Citation
Chakeri, Alireza, "Scalable Clustering Using the Dempster-Shafer Theory of Evidence" (2016). USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/6478