Graduation Year

2022

Document Type

Dissertation

Degree

Ph.D.

Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Electrical Engineering

Major Professor

Ismail Uysal, Ph.D.

Committee Member

Nasir Ghani, Ph.D.

Committee Member

Mia Naeini, Ph.D.

Committee Member

Robert Karam, Ph.D.

Committee Member

Deniz Dayicioglu, Ph.D.

Keywords

Data Augmentation, Healthcare, Spectrotemporal Representation, Subject-specific Learning, Unsupervised Feature Learning

Abstract

A significant gap exists in our knowledge of how domain-specific feature extraction compares to unsupervised feature learning in the latent space of a deep neural network across a range of temporal applications, including human activity recognition. This dissertation addresses this gap specifically for human activity recognition using acceleration data. To ensure reproducibility, we use two publicly available datasets, UniMiB-SHAR and ExtraSensory, both well established in the human activity recognition literature. We methodically analyze the performance of 64 different combinations of i) learning representations (raw temporal data or extracted features) and ii) traditional and modern classifiers with different topologies on iii) both binary (fall detection) and multi-class (activities of daily living) datasets. We report and discuss our findings and conclude that while feature engineering may still be competitive for the activity recognition task, the trainable front-ends of modern deep learning algorithms can benefit from raw temporal data, especially in large quantities. In fact, using raw temporal input, this study claims state-of-the-art performance, significantly outperforming the most recent literature on the UniMiB-SHAR dataset in both activity recognition (88.41% vs. 98.02%) and fall detection (98.71% vs. 99.82%).

We further improve the generalization capability of deep learning networks by introducing a richer way to harness the spectral properties of biological time series in addition to their temporal features. A Stanford research group proposed subject-agnostic features as the state-of-the-art when applied to a large dataset with many participants of different ages. In this dissertation, we demonstrate that implicit feature learning in the latent spaces of deep learning algorithms can be a powerful alternative to finely tuned domain-specific features for human activity recognition. In fact, when the raw sensor data is given a spectrotemporal representation in the form of spectrograms, a standard convolutional neural network without any prior conditioning on the features outperforms the prior state-of-the-art based on subject-agnostic features with statistical significance in every partition of the dataset, with a 29.8% reduction in the overall average error rate.
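As a rough illustration of what a spectrotemporal representation of raw sensor data can look like, the sketch below computes a magnitude spectrogram for a single synthetic acceleration channel using a Hann-windowed short-time FFT. The sampling rate, window length, hop size, and the synthetic signal are all assumptions chosen for the example; the abstract does not state the parameters used in the dissertation, and the function name `spectrogram` is hypothetical.

```python
import numpy as np

def spectrogram(signal, window_len=32, hop=16):
    """Magnitude spectrogram of a 1-D signal via a Hann-windowed short-time FFT.

    window_len and hop are illustrative defaults, not values from the dissertation.
    """
    window = np.hanning(window_len)
    n_frames = 1 + (len(signal) - window_len) // hop
    # Slice the signal into overlapping frames and taper each one
    frames = np.stack(
        [signal[i * hop : i * hop + window_len] * window for i in range(n_frames)]
    )
    # rfft keeps only the non-negative frequencies: window_len // 2 + 1 bins
    return np.abs(np.fft.rfft(frames, axis=1)).T  # shape: (freq_bins, time_frames)

# Synthetic stand-in for one accelerometer axis: 3 s at an assumed 50 Hz
fs = 50
t = np.arange(3 * fs) / fs
rng = np.random.default_rng(0)
accel = np.sin(2 * np.pi * 2.0 * t) + 0.1 * rng.standard_normal(t.size)

spec = spectrogram(accel)
print(spec.shape)  # (frequency bins, time frames)
```

The resulting two-dimensional array can be treated like a single-channel image, which is what allows a standard convolutional neural network to consume it without hand-crafted features.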

Finally, we look at one of the most important practical challenges for human activity recognition: a commercial algorithm can achieve very low accuracies for some outlier subjects. In this context, we propose a method that exploits a source model to fine-tune the parameters for each specific subject and enhance that subject's classification performance. Most research in the literature uses the mean classification accuracy as the performance metric. However, outlier users with low classification accuracies can pull the overall performance of a motion recognition system below the median accuracy reported on a given dataset. To address this problem, we study several approaches for determining the impact of outlier users on the activity recognition task and propose a novel approach, sub-transfer learning, which demonstrates that the principles of transfer learning can be applied within the same dataset when coupled with augmentation techniques. Our results show that on the most difficult users, those with the lowest subject-specific accuracies, performance gains can be as much as 15% when using only a few additional samples for re-tuning. Finally, we demonstrate that the performance improvements of the proposed model are statistically significantly better than those of the source or subject-specific models across a range of datasets with demographically diverse users and sensor locations.
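To make the mean-versus-median point concrete, the sketch below shows how a single low-accuracy subject lowers the mean while leaving the median untouched, and how such subjects might be flagged as candidates for subject-specific re-tuning. The per-subject accuracies, the subject identifiers, and the 0.10 margin are hypothetical values chosen for illustration; they are not results or settings from the dissertation.

```python
from statistics import mean, median

# Hypothetical per-subject accuracies (illustrative only, not dissertation results)
subject_accuracy = {
    "s01": 0.96,
    "s02": 0.95,
    "s03": 0.97,
    "s04": 0.94,
    "s05": 0.62,  # an outlier subject
}

accuracies = list(subject_accuracy.values())
mean_acc = mean(accuracies)      # pulled down by the outlier
median_acc = median(accuracies)  # robust to the outlier

# Flag subjects that fall well below the median (margin is an assumed value)
margin = 0.10
outliers = [s for s, acc in subject_accuracy.items() if median_acc - acc > margin]

print(f"mean={mean_acc:.3f} median={median_acc:.3f} outliers={outliers}")
```

In this toy setting only the flagged subject would be re-tuned from the source model, which mirrors the idea of spending a few additional samples where the shared model performs worst.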
