Graduation Year
2015
Document Type
Thesis
Degree
M.S.C.S.
Degree Name
MS in Computer Science (M.S.C.S.)
Department
Computer Science
Degree Granting Department
Computer Science and Engineering
Major Professor
Yicheng Tu, Ph.D.
Committee Member
Sagar Pandit, Ph.D.
Committee Member
Feng Cheng, Ph.D.
Keywords
Scientific database, Molecular Dynamics, Big Data, Quadtree, SP-GiST
Abstract
Although Molecular Simulation (MS) systems are a major research tool in multiple scientific and engineering fields, there is still a lack of systems for effective data management and fast data retrieval and processing. This is mainly due to the nature of MS, which generates a very large amount of data: a system usually encompasses millions of data items, and one query usually runs over tens of thousands of time frames. To address this, we designed and developed a new application, DCMS (A Data Analytics and Management System for Molecular Simulation), which is intended to speed up the process of discovery in the medical and physics fields.
DCMS stores simulation data in a database and provides a user-friendly interface to upload, retrieve, query, and analyze MS data without having to deal with any raw data. In addition, we created a new indexing scheme, the Time-Parameterized Spatial (TPS) tree, to accelerate query processing through indexes that take advantage of the locality relationships between atoms. The tree was implemented directly inside the PostgreSQL kernel, on top of the SP-GiST platform. Along with this new tree, we defined two new data types, as well as new algorithms for five data-point retrieval queries.
Scholar Commons Citation
Berrada, Meryem, "DCMS: A Data Analytics and Management System for Molecular Simulation" (2015). USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/5453