Graduation Year
2025
Document Type
Thesis
Degree
M.S.
Degree Name
Master of Science (M.S.)
Degree Granting Department
Computer Science and Engineering
Major Professor
Ning Wang, Ph.D.
Committee Member
Ankur Mali, Ph.D.
Committee Member
Nasir Ghani, Ph.D.
Committee Member
Guangjing Wang, Ph.D.
Keywords
Anomaly Detection, Cybersecurity, Deep Learning, Network Security, Threat Detection
Abstract
Network Intrusion Detection Systems (NIDS) play a critical role in identifying and mitigating malicious activities within computer networks. With the rapid evolution of natural language processing (NLP), Large Language Models (LLMs) have emerged as transformative tools across various domains. LLMs, such as OpenAI’s GPT series and Meta’s LLaMA models, have demonstrated remarkable performance in tasks like language generation, reasoning, and classification. Their ability to understand and process vast amounts of data has enabled groundbreaking advancements in areas like healthcare, finance, and cybersecurity. Recent trends highlight their potential to handle unstructured data, perform complex reasoning, and adapt to a wide range of applications, making them a promising technology for enhancing NIDS.
This thesis explores the application of advanced NLP techniques, particularly LLMs, to enhance NIDS performance. We investigate multiple approaches, including the use of Masked Language Models (MLMs) such as BERT, RoBERTa, and DistilBERT, as well as large-scale generative models like Gemma (2B, 9B, and 27B parameter versions) for intrusion detection tasks.
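As a concrete illustration, the sketch below shows one way an MLM such as DistilBERT could serve in this pipeline: flattening tabular flow records into text and mean-pooling the encoder output into dense embeddings for a downstream classifier (the feature-extractor variant described below). The column names, textualization scheme, and toy records are illustrative assumptions, not the thesis's exact setup.

import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
encoder = AutoModel.from_pretrained("distilbert-base-uncased")
encoder.eval()

def row_to_text(row: dict) -> str:
    # Flatten one tabular record into a "feature=value" string the MLM can read.
    return " ".join(f"{k}={v}" for k, v in row.items())

@torch.no_grad()
def embed(rows: list) -> torch.Tensor:
    texts = [row_to_text(r) for r in rows]
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state          # (batch, seq_len, 768)
    mask = batch["attention_mask"].unsqueeze(-1)         # zero out padding tokens
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # mean-pooled embeddings

# Hypothetical toy records standing in for NSL-KDD rows (0 = benign, 1 = attack).
rows = [{"duration": 0, "protocol": "tcp", "src_bytes": 491},
        {"duration": 0, "protocol": "icmp", "src_bytes": 1032}]
labels = [0, 1]

clf = LogisticRegression(max_iter=1000)
clf.fit(embed(rows).numpy(), labels)

The same embeddings can be swapped into any of the baseline models introduced below, which is how the feature-extractor variant is compared against the standalone MLM classifiers.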
Our study begins by implementing standard machine learning models, namely k-Nearest Neighbors (kNN), Random Forest, XGBoost, Support Vector Machines (SVM), and Logistic Regression, on two benchmark datasets: NSL-KDD and CICIoT2023. These models establish a performance baseline for NIDS tasks. Subsequently, we apply MLMs to classify network traffic, both as standalone classifiers and as feature extractors, converting the datasets into dense embeddings and feeding them to the standard models. To further analyze the efficacy of LLMs, we conduct experiments by prompting Gemma models with sampled datasets of varying sizes (1,000, 5,000, and 10,000 rows). The experiments encompass different prompting strategies, including Zero-Shot, One-Shot, In-Context Learning, In-Context Learning with a Coverage-Based selection algorithm, and Chain-of-Thought reasoning. Each experiment is conducted in two variants: one using a selected subset of features and another using the entire feature set. Our analysis focuses on how model size, data representation, and prompting method impact classification performance.
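The sketch below shows how the zero-shot and one-/few-shot (in-context) prompting setups described above could be issued to an instruction-tuned Gemma model through Hugging Face transformers; the prompt wording, example flows, and the 2B model id are illustrative assumptions rather than the exact prompts used in the thesis.

from transformers import pipeline

generator = pipeline("text-generation", model="google/gemma-2-2b-it")

def build_prompt(record: str, examples: list) -> str:
    # Zero-shot when `examples` is empty; one-/few-shot in-context learning otherwise.
    shots = "".join(f"Flow: {x}\nLabel: {y}\n" for x, y in examples)
    return ("Classify each network flow as Benign or Attack.\n"
            f"{shots}Flow: {record}\nLabel:")

# One-shot variant: a single labeled example precedes the query flow.
examples = [("protocol=tcp src_bytes=491 dst_bytes=0 count=2", "Benign")]
query = "protocol=icmp src_bytes=1032 dst_bytes=0 count=511"
out = generator(build_prompt(query, examples), max_new_tokens=5)
print(out[0]["generated_text"])

A Chain-of-Thought variant would extend the instruction to ask for a brief rationale before the label, and the coverage-based variant would choose the in-context examples so that they span the classes present in the sampled pool.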
Scholar Commons Citation
Balaji, Sudharshan, "LLMs in Network Intrusion Detection – A Comprehensive Analysis" (2025). USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/10920
