Graduation Year


Document Type




Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Computer Science and Engineering

Major Professor

Nagarajan Ranganathan, Ph.D.

Co-Major Professor

Srinivas Katkoori, Ph.D.

Committee Member

Adriana Iamnitchi, Ph.D.

Committee Member

Jay Ligatti, Ph.D.

Committee Member

Kandethody Ramachandran, Ph.D.

Committee Member

Babu Joseph, Ph.D.


Big Data, Distributed Systems, Intrusion Detection, Control Flow, Statistical Analysis


In big data systems, the infrastructure is such that large amounts of data are hosted away from the users. Information security is a major challenge in such systems. From the customer’s perspective, one of the big risks in adopting big data systems is in trusting the service provider who designs and owns the infrastructure, with data security and privacy. However, big data frameworks typically focus on performance and the opportunity for including enhanced security measures is limited. In this dissertation, the problem of mitigating insider attacks is extensively investigated and several static and dynamic run-time techniques are developed. The proposed techniques are targeted at big data systems but applicable to any data system in general.

First, a framework is developed to host the proposed security techniques and integrate with the underlying distributed computing environment. We endorse the idea of deploying this framework on special purpose hardware and a basic model of the software architecture for such security coprocessors is presented. Then, a set of compile-time and run-time techniques are proposed to protect user data from the perpetrators. These techniques target detection of insider attacks that exploit data and infrastructure. The compile-time intrusion detection techniques analyze the control flow by disassembling program binaries while the run-time techniques analyze the memory access patterns of processes running on the system.

The proposed techniques have been implemented as prototypes and extensively tested using big data applications. Experiments were conducted on big data frameworks such as Hadoop and Spark using cloud-based services. Experimental results indicate that the proposed techniques successfully detect insider attacks in the context of data loss, data degradation, data exposure and infrastructure degradation.