Graduation Year

2016

Document Type

Thesis

Degree

M.S.C.S.

Degree Name

MS in Computer Science (M.S.C.S.)

Degree Granting Department

Computer Science and Engineering

Major Professor

Yicheng Tu, Ph.D.

Committee Member

Sagar Pandit, Ph.D.

Committee Member

Yan Zhang, Ph.D.

Keywords

Scientific Database, Molecular Dynamics, MapReduce, Primary Queries, Analytical Queries

Abstract

Huge amounts of data are being generated in almost every field; this cannot be avoided and is in fact essential for the advancement of each field. Analyzing this data requires intensive computing power. Molecular simulation is a powerful tool for understanding the behavior of natural systems. A simulation generates a large amount of data while observing spatial and temporal relationships. The challenge is to handle analytical queries over this data, which are often compute-intensive.

Although various tools exist to tackle this problem, in this work we have tried an alternative approach that uses Apache Spark, a modern big data platform, to parallelize the computation of analytical queries. MsSpark consists of three layers: the Apache Spark layer, the MS RDD layer, and the MS Query Processing layer. The MS RDD layer supports data that is specific to molecular simulation. The MS Query Processing layer provides the functionality for executing analytical queries. Caching is used to improve performance. The system can be further extended to cover more analytical queries.

Scholar Commons Citation

Kaur, Parneet, "MsSpark: Implementation of Molecular Simulation Queries Using Apache Spark" (2016). USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/6272

Included in

Computer Sciences Commons

USF Tampa Graduate Theses and Dissertations

MsSpark: Implementation of Molecular Simulation Queries Using Apache Spark
