Doctor of Philosophy (Ph.D.)
Degree Granting Department
Ismail Uysal, Ph.D.
Nasir Ghani, Ph.D.
Yasin Yilmaz, Ph.D.
Alessio Gaspar, Ph.D.
Mohammed Elmusrati, Ph.D.
Class2Vec, Data Compression, Sense2Vec, Sequence Modeling
Processing multivariate sensory time series of variable length is challenging across application domains because the data are complex, high-dimensional, and often non-stationary. Practical examples abound in industry, particularly in sensor networks that monitor the production or distribution of goods around the globe. This thesis tackles the specific problem of time-series data representation: how to better summarize and visualize multivariate time-series data coming from numerous sources in a sensor network distributed across a wide range of application scenarios. On one hand, the analysis and processing of multivariate and heterogeneous time-series data is important for predictive tasks such as regression and classification. On the other hand, novel systems and methods for data summarization and visualization are crucial for gathering actionable and robust insight while ensuring accurate analytics down the information chain. This thesis makes two main contributions:
- In the first part, we cover the statistical and temporal analysis of a novel multivariate time-series dataset for food engineering and propose a new approach (Sense2Vec) for processing variable-length sensory time-series data that leverages various similarity metrics while remaining robust to noise and outliers. We believe that the proposed representation holds great promise for future time-series visualization technologies.
- In the second part, we introduce Class2Vec, a supervised algorithm built as an application of Sense2Vec, in which each class is represented by a single blueprint profile learned from the training dataset. Class2Vec uses dynamic time warping distances between observations and the class blueprints to create a novel classification framework with an unprecedented compression of the time-series training data. We evaluate this framework thoroughly on 70 different datasets hosted on the UCR Archive.
Our results demonstrate that Sense2Vec provides a novel and compressed form of data representation for time-series data in multi-sensor applications, while Class2Vec incorporates meaningful data knowledge in generating blueprint class labels when compared to a baseline algorithm for a domain-agnostic approach.
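The blueprint idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The abstract does not say how a class blueprint is learned, so the sketch assumes the blueprint is the class medoid under dynamic time warping (the training series with the smallest total DTW distance to the other members of its class). The function names `dtw_distance`, `learn_blueprints`, `embed`, and `classify` are hypothetical, and the series are univariate for brevity.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance with absolute-difference local cost."""
    m = len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # the empty prefix aligns with the empty prefix at zero cost
    for i in range(1, len(a) + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def learn_blueprints(series, labels):
    """Keep one series per class. ASSUMPTION: the blueprint is the DTW medoid."""
    by_class = {}
    for s, y in zip(series, labels):
        by_class.setdefault(y, []).append(s)
    return {
        y: min(members, key=lambda c: sum(dtw_distance(c, o) for o in members))
        for y, members in by_class.items()
    }


def embed(x, blueprints):
    """Represent a variable-length series by its DTW distance to each blueprint."""
    return [dtw_distance(x, blueprints[y]) for y in sorted(blueprints)]


def classify(x, blueprints):
    """Assign the label of the nearest blueprint under DTW."""
    return min(blueprints, key=lambda y: dtw_distance(x, blueprints[y]))


# Toy training set: six variable-length series, two classes.
train = [
    [0, 1, 2, 3], [0, 1, 1, 2, 3], [0, 0, 1, 2, 3, 3],
    [3, 2, 1, 0], [3, 2, 2, 1, 0], [3, 3, 2, 1, 0, 0],
]
labels = ["up", "up", "up", "down", "down", "down"]
blueprints = learn_blueprints(train, labels)
```

In this toy setting, the six training series are reduced to two stored blueprints, and any new series of any length maps to a fixed-length vector with one DTW distance per class. This mirrors the compression and variable-length handling the abstract describes, under the medoid assumption stated above.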
Scholar Commons Citation
Abdella, Alla, "A Method for Compact Representation of Heterogenous and Multivariate Time Series for Robust Classification and Visualization" (2021). USF Tampa Graduate Theses and Dissertations.