Graduation Year

2025

Document Type

Dissertation

Degree

Ph.D.

Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Electrical Engineering

Major Professor

Ghulam Rasool, Ph.D.

Co-Major Professor

Yasin Yilmaz, Ph.D.

Committee Member

Issam El Naqa, Ph.D.

Committee Member

Mia Naeini, Ph.D.

Committee Member

Alex Otten, Ph.D.

Keywords

Cancer, Large Language Models, Machine Learning, Robustness, Uncertainty Quantification

Abstract

This dissertation presents four interconnected frameworks that address challenges in oncology data integration, representation learning, and clinical information extraction: MINDS (Multimodal Integration of Oncology Data System), HoneyBee (Harmonized ONcologY Biomedical Embedding Encoder), LLM Extraction (Large Language Model-based Extraction from Pathology Reports), and EAGLE (Embedding Analysis for Generalized Learning in Oncology). Together, these systems unify diverse cancer data modalities, from genomics and clinical records to histopathology images and radiological scans, and provide a foundation for machine learning applications in precision oncology. By addressing barriers in data accessibility, standardization, and analysis, this work establishes a technical infrastructure intended to accelerate cancer research and support clinical decision-making through AI-augmented methods.

Included in

Engineering Commons
