Sparse Glider Datasets: A Case Study for NoSQL Databases
Servers, Educational institutions, Relational databases, Indexes, Marine technology
Multi-sensor platforms like buoys and gliders produce one or more readings per sensor at varying, discrete time intervals. The resulting datasets are matrices whose rows contain readings from the sensors that reported at a given moment and NULL for the sensors that did not. Traditional Relational Database Management Systems (RDBMS) are well suited to dense matrices, in which NULL values are infrequent. The efficiency of these systems deteriorates, though, as the data becomes more sparse. The University of South Florida College of Marine Science Ocean Technology Group (COT) operates four gliders. Each glider produces a different sparse dataset whose structure changes over time. Other data management solutions exist, but they are based on an RDBMS. COT has been investigating an alternative that does not use an RDBMS. Glider Database Alternative with Mongo (GDAM) is a data management system for gliders built on the MongoDB NoSQL database engine. It is live in production at COT. GDAM is a collection of scripts that parse, process, and store real-time glider datasets. Data is parsed as soon as it is transmitted via satellite to our shore-based servers. The system has been tested during two Slocum G1 glider deployments in September and October of 2012. Archival datasets dating back to March of 2009 have also been uploaded into this system. Records are indexed by time, GPS, and depth, with the ability to add more indexes as necessary. The paper outlines dataset problems identified using data from COT glider operations in 2012. These problems inform a discussion of design decisions and possible options considering both RDBMS and NoSQL systems. The paper concludes by discussing the current implementation of GDAM.
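To illustrate the difference between a NULL-padded matrix row and a sparse document of the kind a document store such as MongoDB can hold, the following is a minimal Python sketch. It is not taken from GDAM: the sensor names, the `timestamp` field, and the functions `row_to_document` and `null_fraction` are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List, Optional, Sequence

# Hypothetical sensor columns; a real glider reports many more.
SENSORS = ["temperature", "salinity", "oxygen"]


def row_to_document(
    timestamp: str,
    row: Sequence[Optional[float]],
    sensors: Sequence[str] = SENSORS,
) -> Dict[str, Any]:
    """Convert one NULL-padded matrix row into a sparse document.

    Sensors that did not report (None) are omitted entirely instead
    of being stored as explicit NULL values.
    """
    doc: Dict[str, Any] = {"timestamp": timestamp}
    for name, value in zip(sensors, row):
        if value is not None:
            doc[name] = value
    return doc


def null_fraction(rows: List[Sequence[Optional[float]]]) -> float:
    """Return the fraction of cells in the matrix that are NULL."""
    total = sum(len(r) for r in rows)
    nulls = sum(1 for r in rows for v in r if v is None)
    return nulls / total if total else 0.0


# Two illustrative rows (made-up values): each sensor reports on its own schedule.
rows = [
    [24.1, None, None],
    [None, 35.2, None],
]
documents = [
    row_to_document("2012-09-14T12:00:00Z", rows[0]),
    row_to_document("2012-09-14T12:00:01Z", rows[1]),
]
```

In this sketch each document carries only the fields that were actually measured, so storage grows with the number of readings rather than with the number of sensor columns.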
Citation / Publisher Attribution
Presented at the Oceans 2013 MTS/IEEE Conference on September 23-27, 2013 in San Diego, CA
Scholar Commons Citation
Lindemuth, Michael and Lembke, Chad, "Sparse Glider Datasets: A Case Study for NoSQL Databases" (2013). Marine Science Faculty Publications. 484.