Start Date
5-12-2025 12:00 PM
End Date
5-12-2025 1:00 PM
Description
This project fine-tunes the open-source Gemma 3 model with Low-Rank Adaptation (LoRA) to generate pedagogically aligned practice questions for an Analysis of Algorithms course. The authors created a dataset of 300 question-hint-answer conversations whose progressive hints are designed to scaffold student reasoning. The fine-tuned Gemma 3 was evaluated against a prompt-only baseline using an LLM-as-a-judge framework (Gemini 2.5 Pro) across 187 question pairs. The fine-tuned model improved significantly at avoiding premature solution reveals, difficulty appropriateness, and pedagogical alignment, but declined in hint quality, clarity, and correctness; the authors hypothesize these declines stem from a small training dataset focused heavily on withholding final answers. The study shows that fine-tuning can effectively transform general LLMs into course-specific helpers that support productive learning, though future work must expand and balance the training data to improve overall clarity and correctness.
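The core idea of the LoRA method named above can be sketched in a few lines: a frozen pretrained weight matrix W is augmented with a trainable low-rank product B·A, so only a small fraction of parameters is updated. This is a minimal illustrative sketch, not the project's actual training setup; the dimensions, rank, and alpha value below are assumed for demonstration.

```python
# Minimal sketch of the low-rank update LoRA applies to one linear layer.
# All dimensions and hyperparameters here are illustrative assumptions,
# not the values used in the project described above.
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r = 64, 64, 8  # hypothetical layer size and LoRA rank
alpha = 16                  # hypothetical LoRA scaling factor

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection (zero-init)

def lora_forward(x):
    # Effective weight is W + (alpha / r) * B @ A; only A and B are trained.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B zero-initialized, the adapted layer initially matches the base layer,
# so fine-tuning starts from the pretrained model's behavior.
assert np.allclose(lora_forward(x), W @ x)
```

The practical appeal for a course-helper setting is the parameter count: here A and B together hold 2·r·64 = 1024 trainable values versus 4096 in W, and the ratio shrinks further at realistic model scales.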
Fine-Tuning Open-Source LLMs for Generating Pedagogically Aligned Practice Questions in a University-Level Algorithms Course