Graduation Year

2018

Document Type

Dissertation

Degree

Ph.D.

Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Computer Science and Engineering

Major Professor

Sriram Chellappan, Ph.D.

Co-Major Professor

Mingyang Li, Ph.D.

Committee Member

Srinivas Katkoori, Ph.D.

Committee Member

Mehran Mozaffari Kermani, Ph.D.

Committee Member

Jean-Francois Biasse, Ph.D.

Keywords

Classification, Machine Learning, Network Forensics, Time Series, User Behavior

Abstract

Understanding Internet user behavior and Internet usage patterns is fundamental to developing future access networks and services that meet both technical requirements and Internet user needs. User behavior is routinely studied and measured, but the methods differ with the research discipline of the investigator, and these disciplines rarely cross. We tackle this challenge by developing frameworks that use Internet usage statistics as the main features for understanding Internet user behavior, with the aim of building a complete picture of user behavior and working toward a unified analysis methodology. In this dissertation, we collected Internet usage statistics from privacy-preserving NetFlow logs of 66 student subjects on a college campus, recorded over a month-long period. After cleaning the data and splitting it into groups based on different time windows, we applied statistical analysis and found that each user's Internet usage exhibits a statistically strong correlation with the same user's Internet usage on the same day over multiple weeks, while it is statistically different from that of other Internet users. We also applied time series forecasting to predict future Internet usage from previous statistics. Subsequently, using state-of-the-art machine learning algorithms, we demonstrate the feasibility of profiling Internet users by looking at their Internet traffic. Specifically, when profiled over a 227-second time window, subjects can be classified with 93.21% precision accuracy. We conclude that understanding Internet usage behavior is valuable and can help in developing future access networks and services.
