Graduation Year
2018
Document Type
Thesis
Degree
M.S.C.S.
Degree Name
MS in Computer Science (M.S.C.S.)
Degree Granting Department
Computer Science and Engineering
Major Professor
Jay Ligatti, Ph.D.
Committee Member
Hao Zheng, Ph.D.
Committee Member
Yicheng Tu, Ph.D.
Keywords
Sparse Data Storage, Entity Attribute Value Data Model, Database Modeling, Wide Tables, Clinical Study Data
Abstract
Clinical study data is usually collected without knowing in advance what kinds of data will be recorded. In addition, the set of all possible data points that can apply to a patient in a given clinical study is almost always a superset of the data points actually recorded for any one patient. As a result, clinical data resembles a sparse data set with an evolving schema. To help researchers at the Moffitt Cancer Center better manage clinical data, a tool called GURU was developed. GURU uses the Entity Attribute Value model to handle sparse data and to let users manage a database entity’s attributes without any changes to the database table definition. The Entity Attribute Value model’s read performance improves as the data becomes sparser, but it was observed to perform many times worse than a wide table when the attribute count is not sufficiently large. Ultimately, the design trades read performance for flexibility in the data schema.
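As a rough illustration of the two layouts the abstract compares, the sketch below stores the same sparse patient data in a wide table and in an Entity Attribute Value table using Python's built-in sqlite3 module. The table names, column names, and sample values are hypothetical and chosen only for illustration; they are not taken from GURU or from the thesis.

```python
import sqlite3


def build_demo():
    """Create an in-memory database holding both layouts (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Wide table: one column per possible attribute, NULL when not recorded.
        CREATE TABLE wide_patient (
            patient_id  INTEGER PRIMARY KEY,
            age         INTEGER,
            tumor_stage TEXT,
            smoker      TEXT
        );
        -- EAV table: one row per recorded (entity, attribute, value) triple.
        CREATE TABLE eav_patient (
            patient_id INTEGER NOT NULL,
            attribute  TEXT    NOT NULL,
            value      TEXT,
            PRIMARY KEY (patient_id, attribute)
        );
    """)
    # Wide layout: unrecorded data points still occupy a column as NULL.
    conn.execute("INSERT INTO wide_patient VALUES (1, 54, NULL, NULL)")
    conn.execute("INSERT INTO wide_patient VALUES (2, NULL, 'II', 'no')")
    # EAV layout: only the data points that were actually recorded are stored.
    conn.executemany(
        "INSERT INTO eav_patient VALUES (?, ?, ?)",
        [(1, "age", "54"), (2, "tumor_stage", "II"), (2, "smoker", "no")],
    )
    # A new attribute is just a new row; no ALTER TABLE is required.
    conn.execute("INSERT INTO eav_patient VALUES (1, 'biopsy_site', 'lung')")
    return conn


def read_entity(conn, patient_id):
    """Reassemble one patient's recorded attributes from the EAV rows."""
    cursor = conn.execute(
        "SELECT attribute, value FROM eav_patient WHERE patient_id = ?",
        (patient_id,),
    )
    return dict(cursor.fetchall())


if __name__ == "__main__":
    conn = build_demo()
    print(read_entity(conn, 1))
    print(read_entity(conn, 2))
```

In this sketch the wide table would need a schema change to hold a new attribute such as biopsy_site, whereas the EAV table accepts it as ordinary data. The cost is that reading one entity means collecting several rows rather than fetching a single one, which is consistent with the trade-off between read performance and schema flexibility described above.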
Scholar Commons Citation
Quintero, Michael C., "Constructing a Clinical Research Data Management System" (2017). USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/7081