Graduation Year


Document Type




Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Mathematics and Statistics

Major Professor

Kandethody Ramachandran, Ph.D.

Committee Member

Ming Ji, Ph.D.

Committee Member

Christos P. Tsokos, Ph.D.

Committee Member

Yicheng Tu, Ph.D.


Online Empirical Bayesian, Time series segmentation, Kernel Density, Random Forest, human pattern recognition


Time series analysis has been explored by researchers in many areas, such as statistical research, engineering applications, medical analysis, and finance. To represent the data more efficiently, the mining process is supported by time series segmentation. A time series segmentation algorithm looks for the change points between two different patterns and develops a suitable model for the data observed in each segment. Because computing and storage capability are limited, it is necessary to consider an adaptive and incremental online segmentation method. In this study, we propose Online Empirical Bayesian Kernel Segmentation (OBKS), which combines Online Multivariate Kernel Density Estimation (OMKDE) with the Online Empirical Bayesian Segmentation (OBS) algorithm. This method uses the online multivariate kernel density estimate as the predictive distribution within Online Empirical Bayesian Segmentation, in place of the posterior predictive distribution. The benefit of OMKDE is that it does not require a pre-defined prior function, which makes it more adaptive and adjustable than the posterior predictive distribution.
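
The idea of using a kernel density estimate as the predictive distribution inside an online Bayesian segmentation can be illustrated with a minimal sketch. This is not the OBKS algorithm of the dissertation: it is a univariate run-length recursion in the style of Bayesian online change point detection, and the function names, the constant hazard rate, the fixed bandwidth, and the broad zero-centered baseline density used for an empty segment are all assumptions made for illustration.

```python
import math


def kde_pdf(x, sample, bandwidth):
    """Gaussian kernel density estimate at x from a non-empty sample."""
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)
    return norm * total / len(sample)


def map_run_lengths(stream, hazard=0.02, bandwidth=0.5, base_scale=10.0):
    """Return the most probable run length after each observation.

    run_probs[r] is the probability that the current segment holds r points.
    The predictive density for run length r is a KDE over the last r points.
    All default parameter values are illustrative assumptions.
    """
    history = []
    run_probs = [1.0]  # before any data, the segment is empty
    result = []
    for x in stream:
        preds = []
        for r in range(len(run_probs)):
            if r == 0:
                # Empty segment: assumed broad baseline density centered at 0.
                p = kde_pdf(x, [0.0], base_scale)
            else:
                p = kde_pdf(x, history[-r:], bandwidth)
            preds.append(max(p, 1e-300))  # guard against underflow to zero
        joint = [w * p for w, p in zip(run_probs, preds)]
        evidence = sum(joint)
        # A change point resets the run to 0; otherwise every run grows by 1.
        run_probs = [hazard] + [(1.0 - hazard) * j / evidence for j in joint]
        history.append(x)
        result.append(max(range(len(run_probs)), key=run_probs.__getitem__))
    return result


# Toy stream: 40 points near 0 followed by 40 points near 5.
stream = [0.3 * math.sin(i) for i in range(40)]
stream += [5.0 + 0.3 * math.sin(i) for i in range(40)]
runs = map_run_lengths(stream)
# A drop in the most probable run length marks a detected change point.
change_points = [t for t in range(1, len(runs)) if runs[t] < runs[t - 1]]
print(change_points)
```

On this toy stream the most probable run length grows within each regime and drops once, at the first observation of the second regime. The sketch recomputes each density from the stored history, so it is neither incremental nor memory-bounded; an online estimator would update the density as points arrive.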

Human Activity Recognition (HAR) using smartphones with embedded sensors is a modern time series application used in many areas, such as therapeutic applications and car sensors. The important procedures related to the HAR problem include classification, clustering, feature extraction, dimension reduction, and segmentation. Segmentation, as the first step of HAR analysis, attempts to represent the time interval more effectively and efficiently. The traditional segmentation method for HAR partitions the time series into short, fixed-length segments. However, these segments might not be long enough to capture sufficient information about the entire activity time interval. In this research, as the first step, we segment the observations of each complete activity into a single interval using the Online Empirical Bayesian Kernel Segmentation algorithm. The observations of these activities are generated by a smartphone with a built-in accelerometer.
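
The limitation of fixed-length segmentation can be seen in a small sketch. The helper name `fixed_windows`, the window width, and the toy label sequence are hypothetical and are not taken from the dissertation's data.

```python
def fixed_windows(signal, width, step):
    """Yield (start, window) pairs of length `width`, advancing by `step`."""
    for start in range(0, len(signal) - width + 1, step):
        yield start, signal[start:start + width]


# Toy per-sample activity labels: the activity changes at index 7.
labels = ["sitting"] * 7 + ["walking"] * 5

# Windows that mix two activities cannot be given a single clean label.
mixed = [start for start, window in fixed_windows(labels, width=4, step=4)
         if len(set(window)) > 1]
print(mixed)
```

Here the window starting at index 4 straddles the transition from sitting to walking, which is the kind of case that activity-level segmentation is meant to avoid.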

Based on the segmentation result, we introduce a two-layer random forest classification method. The first layer identifies the main group; the second layer identifies the subgroup within each main group. We evaluate the performance of our method on six activities performed by 30 volunteers: sitting, standing, lying, walking, walking\_upstairs, and walking\_downstairs. Automatically distinguishing walking\_upstairs from walking\_downstairs requires more detailed information and more complex features, since these two activities are very similar. For real-time activity recognition on smartphones with embedded accelerometers, the first layer classifies activities as static or dynamic, and the second layer classifies each main group into its sub-classes, depending on the first-layer result. For the data collected, we obtain an overall accuracy of 91.4\% on the six activities and an overall accuracy of 100\% when distinguishing only dynamic activity (walking, walking\_upstairs, walking\_downstairs) from static activity (sitting, standing, lying).
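
The two-layer routing described above can be sketched as follows. To keep the example dependency-free, a nearest-centroid classifier stands in for the random forests used in the dissertation; the class names, the two-dimensional features, and the toy training values are all hypothetical.

```python
STATIC = {"sitting", "standing", "lying"}


class NearestCentroid:
    """Stand-in model (NOT a random forest): picks the closest class mean."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {label: [v / counts[label] for v in acc]
                          for label, acc in sums.items()}
        return self

    def predict_one(self, row):
        def sq_dist(label):
            return sum((a - b) ** 2
                       for a, b in zip(row, self.centroids[label]))
        return min(self.centroids, key=sq_dist)


class TwoLayerClassifier:
    """Layer 1 picks static vs dynamic; layer 2 picks the activity."""

    def __init__(self, make_model=NearestCentroid):
        self.make_model = make_model

    def fit(self, X, y):
        groups = ["static" if label in STATIC else "dynamic" for label in y]
        self.layer1 = self.make_model().fit(X, groups)
        self.layer2 = {}
        for group in ("static", "dynamic"):
            rows = [(x, label) for x, label, g in zip(X, y, groups)
                    if g == group]
            self.layer2[group] = self.make_model().fit(
                [x for x, _ in rows], [label for _, label in rows])
        return self

    def predict_one(self, row):
        group = self.layer1.predict_one(row)
        return group, self.layer2[group].predict_one(row)


# Hypothetical features per segment: [motion energy, vertical tilt].
X = [[0.1, 0.2], [0.2, 0.3], [0.1, 0.9], [0.2, 1.0], [0.1, -0.8], [0.2, -0.9],
     [2.0, 0.9], [2.1, 1.0], [2.2, 1.5], [2.3, 1.6], [2.2, 0.3], [2.3, 0.4]]
y = (["sitting"] * 2 + ["standing"] * 2 + ["lying"] * 2 + ["walking"] * 2
     + ["walking_upstairs"] * 2 + ["walking_downstairs"] * 2)

model = TwoLayerClassifier().fit(X, y)
print(model.predict_one([2.2, 1.4]))
print(model.predict_one([0.15, -0.7]))
```

Any model exposing the same `fit` and `predict_one` interface, for example a thin wrapper around a random forest implementation, could be passed as `make_model`. Splitting the decision this way lets the second-layer models concentrate on the harder distinctions within a group, such as walking\_upstairs versus walking\_downstairs.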