Graduation Year

2023

Document Type

Dissertation

Degree

Ph.D.

Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Computer Science and Engineering

Major Professor

John Licato, Ph.D.

Committee Member

Shaun Canavan, Ph.D.

Committee Member

Kelsey Merlo, Ph.D.

Committee Member

Nicole Beckage, Ph.D.

Committee Member

Ismail Uysal, Ph.D.

Committee Member

Michael Maness, Ph.D.

Keywords

language model, NLP, reliability, validity, artificial intelligence

Abstract

Large language models (LLMs) are poised to transform both academia and industry. But the excitement around these generative AIs has also been met with concern about the true extent of their capabilities. This dissertation helps to address these concerns by examining the capabilities of LLMs using the tools of psychometrics. We focus on analyzing the capabilities of LLMs on the task of natural language inference (NLI), a foundational benchmark often used to evaluate new models. We demonstrate that LLMs can reliably predict the psychometric properties that NLI items would exhibit were those items administered to humans. Through a series of experiments, we show that LLMs can improve the validity and reliability of NLI both by helping to refine the operationalization of the construct and by automatically generating new items with superior validity evidence. Finally, in a related line of work, we demonstrate that LLMs can predict the age at which children acquire words in their vocabulary. Our research demonstrates the potential of applying the tools of psychometrics to the analysis of generative AI and paves the way for creating better AI assessments.
