Graduation Year
2007
Document Type
Thesis
Degree
M.S.C.S.
Degree Granting Department
Computer Science
Major Professor
Nagarajan Ranganathan, Ph.D.
Committee Member
Srinivas Katkoori, Ph.D.
Committee Member
Soontae Kim, Ph.D.
Keywords
Hierarchical clustering, Pattern recognition, Parallel algorithms, Pattern matrix, Proximity matrix
Abstract
Clustering techniques play an important role in exploratory pattern analysis, unsupervised pattern recognition, and image segmentation applications. Clustering algorithms are computationally intensive in nature. This thesis proposes new parallel algorithms for Single Link and Complete Link hierarchical clustering. The parallel algorithms have been mapped onto a SIMD machine model with a linear interconnection network. The model consists of a linear array of N processing elements (PEs), where N is the number of patterns to be clustered, interfaced to a host machine. The interconnection network provides inter-PE, PE-to-host, and host-to-PE communication. For single link clustering, each PE maintains a sorted list of its first log N nearest neighbors, and the host maintains a heap of the root elements of all the PEs. The determination of the smallest entry in the distance matrix and the update of the distance matrix are achieved in O(log N) time. In the case of complete link clustering, each PE maintains a heap data structure of the inter-pattern distances. This significantly reduces the computation time for the determination of the smallest entry in the distance matrix during each iteration, from O(N^2) to O(N), as the root element in each PE gives its nearest neighbor. The proposed algorithms are faster and simpler than previously known algorithms for hierarchical clustering. For clustering a data set with N patterns using N PEs, the computation time for the single link clustering algorithm is shown to be O(N log N), and the time complexity for the complete link clustering algorithm is shown to be O(N^2). The parallel algorithms have been verified through simulations on the Intel iPSC/2 parallel machine.
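The complete link idea described in the abstract, where each PE keeps a heap of inter-pattern distances so that its root gives the nearest neighbor, can be illustrated with a small sequential sketch. This is not the thesis's SIMD algorithm: it simulates the PEs one after another on a single processor, assumes one-dimensional patterns with absolute difference as the distance, and discards outdated heap entries lazily. The function name complete_link and all variable names are hypothetical.

```python
import heapq


def complete_link(points):
    """Sequentially simulate N 'PEs', each holding a min-heap of distances.

    Assumptions (not from the thesis): 1-D patterns, absolute difference as
    the distance, and lazy removal of outdated heap entries.
    Returns a list of merges (kept_cluster, absorbed_cluster, distance).
    """
    n = len(points)
    # dist[i][j] is the current complete link distance between clusters i and j.
    dist = {i: {j: abs(points[i] - points[j]) for j in range(n) if j != i}
            for i in range(n)}
    # One heap per simulated PE; the root is that PE's nearest neighbor.
    heaps = {i: [(d, j) for j, d in dist[i].items()] for i in range(n)}
    for h in heaps.values():
        heapq.heapify(h)

    merges = []
    while len(dist) > 1:
        # Scan the root of every heap: O(N) roots instead of O(N^2) entries.
        best = None
        for i, h in heaps.items():
            # Drop entries that refer to absorbed clusters or old distances.
            while h and dist[i].get(h[0][1]) != h[0][0]:
                heapq.heappop(h)
            if h and (best is None or (h[0][0], i, h[0][1]) < best):
                best = (h[0][0], i, h[0][1])
        d, a, b = best
        a, b = min(a, b), max(a, b)
        merges.append((a, b, d))

        # Merge cluster b into cluster a.
        del dist[a][b]
        del dist[b]
        del heaps[b]
        for k in dist:
            if k == a:
                continue
            # Complete link rule: keep the larger of the two old distances.
            new_d = max(dist[k][a], dist[k].pop(b))
            dist[k][a] = new_d
            dist[a][k] = new_d
            heapq.heappush(heaps[k], (new_d, a))
        # Rebuild the heap of the merged cluster from its updated row.
        heaps[a] = [(dd, k) for k, dd in dist[a].items()]
        heapq.heapify(heaps[a])
    return merges


if __name__ == "__main__":
    print(complete_link([0.0, 1.0, 5.0, 7.0]))
```

In the thesis's model the root lookups and distance updates would be carried out by the PEs in parallel; here they are ordinary loops, so the sketch shows only the data structure and the merge rule, not the stated running times.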
Scholar Commons Citation
Arumugavelu, Shankar, "SIMD Algorithms for Single Link and Complete Link Pattern Clustering" (2007). USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/609