Graduation Year


Document Type




Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Computer Science and Engineering

Major Professor

Sriram Chellappan, Ph.D.

Committee Member

Srinivas Katkoori, Ph.D.

Committee Member

Mehran Mozaffari Kermani, Ph.D.

Committee Member

Nasir Ghani, Ph.D.

Committee Member

Theresa Beckie, Ph.D.


Keywords

Computer Vision, CNN, Deep Learning, Distracted Driving, Object Detection, Human-Centered Computing


Artificial Intelligence (AI) techniques have had a significant impact on our daily lives. AI algorithms have already enabled numerous applications across diverse fields, and many more are yet to come. In this dissertation, we design, deploy, and validate computer vision algorithms for innovative, high-impact, societal-scale applications. We focus specifically on two applications: detection of distracted driving and detection of breeding habitats of mosquito vectors.

Distracted driving is a major problem around the world. It occurs when a driver diverts his/her focus from the road and engages in other activities (e.g., texting, calling, drinking, playing the radio, eating, sleeping), which can cause visual, manual, and cognitive distraction. There is an increasing push from various stakeholders to develop new technology-enabled methods for detecting, in real time, when drivers are distracted. In this dissertation, we design a computer vision technique that processes images captured inside cars to automatically detect instances of distracted driving. Our innovation lies in adding contextual feedback to the classification of distracted driving. We present our contributions in two chapters. First, we consider six classes of driving activities: five distracted driving classes and one safe driving class. The distracted classes are texting right hand, texting left hand, talking left hand, talking right hand, and drinking. We use a well-established dataset (described later) containing numerous images of drivers engaged in distracted driving, captured by a camera inside the car. To detect instances of distracted driving, we employ a context-driven approach in which we first detect objects in the car that can contribute to distracted driving: the left and right hands, steering wheel, smart phone, and bottle. To do so, we design an object detection model based on the Faster R-CNN architecture. Once the objects are detected, a simple machine learning technique classifies the activity as distracted or safe based on the relative locations of these objects in the image.
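The idea of classifying an activity from the relative locations of detected objects can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the dissertation: the box format, the object names used as dictionary keys, the normalized center-distance feature, and the hand-written threshold rule that stands in for the learned classifier.

```python
from math import hypot

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
def center(box):
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)

def relative_features(detections, image_width):
    """Normalized center-to-center distance for every pair of detected objects."""
    names = sorted(detections)
    feats = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            (ax, ay), (bx, by) = center(detections[a]), center(detections[b])
            feats[f"{a}-{b}"] = hypot(ax - bx, ay - by) / image_width
    return feats

def classify(detections, image_width, near=0.15):
    """Illustrative stand-in rule: a hand close to a phone or bottle is 'distracted'.

    The threshold `near` is an arbitrary assumption, not a value from the source.
    """
    feats = relative_features(detections, image_width)

    def is_near(a, b):
        key = "-".join(sorted((a, b)))  # same ordering as relative_features
        return feats.get(key, float("inf")) < near

    for hand in ("left_hand", "right_hand"):
        if is_near(hand, "smart_phone") or is_near(hand, "bottle"):
            return "distracted"
    return "safe"

# Example detections, as an object detector might return them (made-up values).
example = {
    "left_hand": (100, 300, 160, 360),
    "right_hand": (400, 300, 460, 360),
    "steering_wheel": (200, 250, 440, 450),
    "smart_phone": (410, 290, 450, 350),
}
label = classify(example, image_width=640)
```

In practice the pairwise features would feed a trained classifier rather than a fixed threshold; the sketch only shows how object positions alone can carry enough context to separate the classes.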

Next, using the same dataset, we increase the total number of driving classes to ten: nine distracted driving classes and one safe driving class. In addition to the previous five distracted driving classes (texting right hand, texting left hand, talking left hand, talking right hand, and drinking), the four newly added classes are operating the car radio, looking back, doing makeup, and talking to a side passenger. The number of objects considered also increases to nine: left hand, right hand, steering wheel, smart phone, bottle, radio, face look straight, face look back, and face look right. For feature extraction, we employ the ResNet-101 network [1] because of its much lower complexity and improved accuracy. In our first study, with fewer objects and classes, we achieved 75% classification accuracy; in our second study, with more objects and classes, we improved the accuracy to 94%. We believe our proposed methods are fast, practical, and context-aware.

Later in this dissertation, we investigate another significant societal-scale problem: combating mosquito vectors in nature. To do so, we design computer vision techniques to detect mosquito breeding habitats in unmanned aerial vehicle (UAV) videos. For this study, we collected UAV footage from Rwanda, a country in sub-Saharan Africa where malaria is endemic. We designed a Mask Region-based Convolutional Neural Network (Mask R-CNN) model that operates on the video/image data to automatically detect and geo-locate potential mosquito breeding habitats and determine habitat sizes. The overall goal is to engage citizens proactively to destroy such habitats using natural methods, thereby combating mosquito vectors and hence malaria.
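One plausible way to turn a segmentation mask into a habitat size is to count the mask pixels and scale by the ground area each pixel covers. The sketch below assumes a downward-facing camera over roughly flat ground and a known ground sampling distance (meters per pixel); these assumptions, the function name, and the sample values are illustrative and do not come from the dissertation.

```python
def habitat_area_m2(mask, gsd_m_per_px):
    """Approximate ground area covered by a binary instance mask.

    mask: 2D list of 0/1 values (1 = pixel belongs to the habitat).
    gsd_m_per_px: assumed ground sampling distance in meters per pixel.
    """
    habitat_pixels = sum(1 for row in mask for value in row if value)
    return habitat_pixels * gsd_m_per_px ** 2

# Made-up 3 x 4 mask with six habitat pixels.
sample_mask = [
    [0, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 0, 0],
]
area = habitat_area_m2(sample_mask, gsd_m_per_px=0.05)
```

The ground sampling distance depends on flight altitude and camera parameters, so any real estimate would need those values from the UAV flight metadata.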

We believe that the research in this dissertation enables the creation of innovative AI applications for the greater good and can stimulate future work in this space to serve humanity.