Graduation Year
2013
Document Type
Dissertation
Degree
Ph.D.
Degree Granting Department
Engineering Computer Science
Major Professor
Yi-Cheng Tu
Keywords
Big Data, Compression, Edit Automata, GPU Computing, Molecular Simulations, Parallel Processing
Abstract
Applications used in basic-science research and development generate large amounts of data. The size of this data introduces significant challenges in storage, analysis, and privacy preservation. This dissertation proposes novel techniques to analyze the data efficiently and to reduce storage requirements through data compression, while preserving privacy and providing data security.
We present an efficient technique to compute an analytical query called the spatial distance histogram (SDH). The technique exploits special spatiotemporal properties of the data to process the SDH efficiently on the fly. General-purpose graphics processing units (GPGPUs, or simply GPUs) are employed to further boost the performance of the algorithm.
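For readers unfamiliar with the query, an SDH counts the distances between all pairs of points (for example, atoms in a simulation frame) and groups them into distance buckets. The sketch below is a brute-force baseline only, not the dissertation's technique: it uses no spatiotemporal properties and no GPU. The function name, the equal-width buckets, the clamping of long distances into the last bucket, and the sample points are all assumptions made for illustration.

```python
import math
from itertools import combinations


def spatial_distance_histogram(points, bucket_width, num_buckets):
    """Brute-force SDH: bin every pairwise distance into equal-width buckets.

    Hypothetical helper for illustration; it visits all N*(N-1)/2 pairs,
    which is the quadratic cost that faster SDH methods aim to avoid.
    """
    hist = [0] * num_buckets
    for p, q in combinations(points, 2):
        d = math.dist(p, q)  # Euclidean distance between the two points
        # Assumption: distances beyond the last bucket are clamped into it.
        idx = min(int(d // bucket_width), num_buckets - 1)
        hist[idx] += 1
    return hist


# Four made-up 3D points standing in for atom coordinates.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 3.0, 0.0), (0.0, 0.0, 5.0)]
hist = spatial_distance_histogram(points, bucket_width=2.0, num_buckets=3)
print(hist)
```

Four points give six pairs, so the bucket counts always sum to six. The pair loop is where the quadratic cost of this baseline comes from.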
The size of the data generated in scientific applications poses problems of disk space requirements, input/output (I/O) delays, and data transfer bandwidth requirements. These problems are addressed by applying the proposed compression technique. We also address the issue of preserving privacy and security in scientific data by proposing a security model. The security model monitors user queries submitted to the database that stores and manages the scientific data. Outputs of user queries are also inspected to detect privacy breaches. The monitor enforces privacy policies, allowing only those queries and results that satisfy the policies specified by the data owner.
Scholar Commons Citation
Kumar, Anand, "Efficient and Private Processing of Analytical Queries in Scientific Datasets" (2013). USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/4822