Keywords: item response curve, science and mathematics education, assessment
We introduce an approach for quantitatively comparing the item response curves (IRCs) of any two populations on a multiple-choice test instrument. In this study, we employ both simulated and actual data. We apply our approach to a dataset of 12,187 participants on the 25-item Science Literacy Concept Inventory (SLCI), which includes ample demographic data on the participants. Prior comparisons of the IRCs of different populations addressed only two populations and were made by visual inspection. Our approach allows for quickly comparing the IRCs of many pairs of populations to identify the items where substantial differences exist. For each item, we compute the IRC dot product, a number between 0 and 1 that equals 1 when the IRCs of the two populations are identical. We then determine whether the value of the IRC dot product indicates significant differences between populations of real students. Through this process, we can quickly discover bias across demographic groups. As a case example, we apply our metric to identify four SLCI items that exhibit gender bias. We further find that gender bias is present on those items for non-science majors but not for science majors.
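The abstract does not state the formula for the IRC dot product, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes an IRC is the fraction of respondents at each total score who chose each answer option, and it uses a normalized dot product (cosine similarity) of the two flattened IRCs, which lies between 0 and 1 for non-negative fractions and equals 1 when the curves are identical. The function names `item_response_curve` and `irc_similarity` are hypothetical.

```python
import numpy as np

def item_response_curve(responses, total_scores, n_options, max_score):
    """Fraction of respondents at each total score who chose each option.

    responses:    1-D int array, chosen option index (0..n_options-1) for one item
    total_scores: 1-D int array, each respondent's total test score
    Returns an array of shape (max_score + 1, n_options). Score levels with
    no respondents are left as rows of zeros (an assumption of this sketch).
    """
    irc = np.zeros((max_score + 1, n_options))
    # Count respondents for every (total score, chosen option) pair.
    np.add.at(irc, (total_scores, responses), 1.0)
    # Convert counts to fractions within each score level.
    row_totals = irc.sum(axis=1, keepdims=True)
    np.divide(irc, row_totals, out=irc, where=row_totals > 0)
    return irc

def irc_similarity(irc_a, irc_b):
    """Normalized dot product of two IRCs (assumed form of the metric)."""
    a, b = irc_a.ravel(), irc_b.ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return float("nan")  # undefined when either curve is empty
    return float(a @ b / denom)

# Toy data for one item: two populations with the same total scores
# but different option choices.
scores = np.array([0, 1, 1, 2, 2, 2])
pop_a = item_response_curve(np.array([0, 0, 1, 1, 1, 2]), scores, 3, 2)
pop_b = item_response_curve(np.array([2, 2, 0, 0, 0, 1]), scores, 3, 2)
print(irc_similarity(pop_a, pop_a))  # identical curves
print(irc_similarity(pop_a, pop_b))  # differing curves
```

In this sketch, a score level that is empty in one population but populated in the other lowers the similarity, so with real data sparse score levels would likely need binning or weighting before values are compared across items.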
Walter, Paul J., Edward Nuhfer, and Crisel Suarez. "Probing for Bias: Comparing Populations Using Item Response Curves." Numeracy 14, Iss. 1 (2021): Article 2. DOI: https://doi.org/10.5038/1936-46188.8.131.527
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License