Graduation Year

2025

Document Type

Dissertation

Degree

Ph.D.

Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Computer Science and Engineering

Major Professor

Sriram Chellappan, Ph.D.

Co-Major Professor

Ryan M. Carney, Ph.D.

Committee Member

Ankur Mali, Ph.D.

Committee Member

John Murray-Bruce, Ph.D.

Committee Member

Trung Le, Ph.D.

Keywords

Artificial Intelligence, Citizen Science, Computer Vision, Explainable AI, Machine Learning

Abstract

According to the World Health Organization (WHO), mosquitoes are the deadliest animals on Earth, responsible for more human deaths annually than any other species. Mosquito-borne illnesses continue to pose severe risks to global health. In 2015 alone, there were an estimated 214 million malaria cases worldwide. Similarly, a 2016 report from the Centers for Disease Control and Prevention (CDC) revealed that Puerto Rico’s Department of Health received over 62,500 suspected cases of Zika, with 29,345 confirmed positive cases. In 2019, Southeast Asia experienced its worst dengue outbreak in recorded history. Of the approximately 4,500 mosquito species distributed across 34 genera, only a select few act as primary vectors of disease. Most disease transmission is attributed to mosquitoes from three key genera: Aedes (Ae.), Anopheles (An.), and Culex (Cx.). Within these groups, specific species are associated with particular diseases. For instance, An. gambiae is a major malaria vector in Africa, while An. stephensi plays a similar role in India. Ae. aegypti is known for spreading dengue, yellow fever, chikungunya, and Zika, whereas Cx. nigrip is a carrier of West Nile virus and various types of encephalitis.

Because not all mosquitoes are capable of transmitting disease, identifying the vectors during an outbreak becomes a critical first step in disease control. Public health teams often deploy mosquito traps across affected regions, capturing hundreds of specimens. However, only a fraction of these are vectors, and it becomes essential to accurately identify them to estimate population density and transmission risk. Currently, this identification process relies heavily on visual examination by trained taxonomists, who must inspect each specimen under a microscope. This method is not only time-consuming but also mentally demanding, placing significant strain on personnel responsible for the classification and documentation of trapped mosquitoes. In this dissertation, we present a comprehensive, interpretable, and field-ready AI framework for mosquito classification that bridges the gap between academic research and real-world vector surveillance efforts.

In the first stage of the work, we examine the feasibility of classifying the gonotrophic stages (i.e., reproductive states) of mosquitoes from three medically significant genera: Aedes, Anopheles, and Culex. A novel dataset was collected from 139 adult female mosquitoes across all four gonotrophic stages of the cycle (unfed, fully fed, semi-gravid, and gravid). From these mosquitoes and stages, a total of 1,959 images were captured on a plain background via multiple smartphones. Subsequently, we trained four distinct AI model architectures (ResNet50, MobileNetV2, EfficientNet-B0, and ConvNeXtTiny), validated them using unseen data, and compared their overall classification accuracies. Additionally, we analyzed t-SNE plots to visualize the formation of decision boundaries in a lower-dimensional space. Notably, EfficientNet-B0 achieved the highest overall accuracy, 93.59%, and showed the most clearly separated decision boundaries. We also assessed the explainability of our AI model by implementing Grad-CAM, a technique that highlights the pixels in an image that were prioritized for classification. We observed that the highest significance was assigned to pixels representing the mosquito abdomen, indicating that the model relies on the anatomically relevant region. This work establishes the potential of machine learning in addressing entomological classification tasks beyond species identification.
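The core Grad-CAM computation described above can be sketched as follows. This is a minimal, framework-free illustration and not the dissertation's implementation: it assumes the activations of a convolutional layer and the gradients of the class score with respect to them have already been extracted, and the function name `grad_cam` and the nested-list layout are hypothetical.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap for one convolutional layer.

    activations: K channels, each an H x W nested list of floats.
    gradients:   same shape, d(class score) / d(activation).
    Both inputs are assumed to come from the user's own framework.
    """
    k = len(activations)
    h = len(activations[0])
    w = len(activations[0][0])

    # Channel weights: global average of the gradients over all positions.
    weights = [
        sum(sum(row) for row in gradients[c]) / (h * w) for c in range(k)
    ]

    # Weighted sum of the channels, followed by ReLU.
    cam = [
        [
            max(0.0, sum(weights[c] * activations[c][i][j] for c in range(k)))
            for j in range(w)
        ]
        for i in range(h)
    ]

    # Optional convenience step for display: scale the map to [0, 1].
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


# Toy example: channel 0 supports the class, channel 1 opposes it.
acts = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(acts, grads)
```

In the toy example, only the position activated by the positively weighted channel remains highlighted, which mirrors how a heatmap concentrated on the abdomen would arise.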

As a next step, we expanded existing work on adult mosquito classification and segmentation by incorporating a new vector species, Aedes scapularis, and refining segmentation labels for enhanced anatomical precision. Leveraging transfer learning, both the species classification and segmentation models were retrained on augmented datasets, demonstrating measurable improvements in generalization and predictive performance. This work also highlights the flexibility and adaptability of our deep learning pipeline to accommodate new species and updated annotations. As our global image database continues to grow with contributions from diverse geographic locations, this retraining strategy allows the model to continuously learn new features while retaining previously acquired knowledge. This ensures long-term scalability and robustness of the system, enabling it to evolve alongside real-world vector surveillance needs. We further bridge research and application by developing an interactive web-based dashboard that allows users, such as vector control teams or citizen scientists, to upload mosquito images and receive immediate classification and segmentation feedback. This platform marks a crucial step toward operationalizing AI tools for public health interventions.

After achieving encouraging results from our preliminary work, we turned to the problem of early detection of invasive mosquito species, focusing on a real-world case involving Anopheles stephensi in Madagascar. Using over 1,400 expertly labeled larval images from 8 mosquito species across 3 genera, we trained and evaluated binary and multi-class deep learning models at various taxonomic levels. A field-submitted image from Antananarivo, captured through the NASA GLOBE Observer app, was consistently classified as An. stephensi across all trained models with high confidence, a result confirmed through test-time augmentation and cross-validation. This study illustrates the power of deep learning in interpreting noisy, field-captured data and supporting early-warning surveillance for emerging vectors.
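Test-time augmentation, as mentioned above, averages a model's predictions over several transformed copies of the same image. The sketch below is illustrative only: the abstract does not state which augmentations were used, so the flips here are an assumption, and `predict_with_tta`, `hflip`, `vflip`, and the toy classifier are hypothetical names.

```python
def hflip(image):
    """Mirror an image (nested list of rows) left to right."""
    return [row[::-1] for row in image]


def vflip(image):
    """Mirror an image (nested list of rows) top to bottom."""
    return image[::-1]


def predict_with_tta(predict_proba, image, augmentations):
    """Average class probabilities over the original image and its augmentations.

    predict_proba: any callable mapping an image to a list of class
                   probabilities (stands in for a trained model).
    """
    views = [image] + [augment(image) for augment in augmentations]
    probs = [predict_proba(view) for view in views]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


def toy_classifier(image):
    """Hypothetical stand-in model: looks only at the top-left pixel."""
    p = image[0][0]
    return [p, 1.0 - p]


sample = [[1.0, 0.0], [0.0, 0.0]]
averaged = predict_with_tta(toy_classifier, sample, [hflip, vflip])
```

A prediction that stays stable across such views is less likely to depend on the orientation or framing of a single field photograph.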

Continuing this stephensi classification thread, we explored the application of advanced deep learning architectures for the classification of Anopheles stephensi mosquitoes in both larval and adult stages, addressing the significant challenge of class imbalance in ecological datasets. Given the practical significance of detecting this particular vector among many other mosquitoes in nature, we center this study on class imbalance. Specifically, our dataset is imbalanced (as it would be in nature), consisting of 1,195 images of stephensi mosquitoes and 6,021 images of non-stephensi mosquitoes, all taken with modern smartphones against varying backgrounds. We employed three state-of-the-art models (MobileNetV2, EfficientNet-B1, and NASNetMobile) and applied class balancing techniques such as down-sampling and focal loss to emphasize the minority class. We assessed the models on several performance metrics. Our findings reveal that NASNetMobile outperformed the other models in larvae classification, achieving 97.66% accuracy, while EfficientNet-B1 excelled in adult mosquito classification with 99.62% accuracy. The implementation of focal loss effectively mitigated class imbalance, significantly improving sensitivity and precision across all models. Additionally, Grad-CAM visualizations confirmed that the models focused on biologically relevant features, enhancing interpretability. This work highlights the potential of deep learning techniques in improving mosquito surveillance and vector control strategies, ultimately contributing to public health initiatives aimed at combating malaria and other mosquito-borne diseases.
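To illustrate how focal loss emphasizes the minority class, the following is a minimal sketch of the binary form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The values gamma = 2.0 and alpha = 0.25 are the defaults commonly used with this loss and are assumptions here, not the settings used in the dissertation; the function name is hypothetical.

```python
import math


def binary_focal_loss(y_true, p_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss.

    y_true: labels, 1 for the positive (e.g. stephensi) class, 0 otherwise.
    p_pred: predicted probabilities of the positive class.
    gamma:  focusing parameter; larger values down-weight easy examples more.
    alpha:  weight applied to the positive class (1 - alpha to the negative).
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)           # avoid log(0)
        p_t = p if y == 1 else 1.0 - p            # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(y_true)


# A confidently correct example contributes far less than a misclassified one.
easy = binary_focal_loss([1], [0.9])
hard = binary_focal_loss([1], [0.1])
```

With gamma set to 0 the expression reduces to alpha-weighted cross-entropy, which makes the role of the focusing term easy to check.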

Together, the components of this dissertation form a cohesive pipeline for mosquito classification that not only achieves high accuracy but also promotes explainability and usability. The integration of reproductive stage detection, species-level classification, anatomical segmentation, single vector detection, and real-world deployment via a user dashboard represents a holistic approach to entomological surveillance. Each phase, from dataset design to interactive tooling, was built with considerations for public health deployment, particularly in resource-limited settings where timely mosquito identification can be the difference between containment and outbreak. This work stands as a step forward in the development of intelligent surveillance systems that empower public health authorities to respond proactively to vector-borne threats in an era of climate change, urban expansion, and global vector migration.
