Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Curriculum and Instruction

Major Professor

Yiping Lou, Ph.D.

Committee Member

Eun Sook Kim, Ph.D.

Committee Member

Yi-Hsin Chen, Ph.D.

Committee Member

James Hatten, Ph.D.


Keywords

Assessment literacy, professional development, item development, teacher training


The purpose of this study was to examine the process and impact of assessment training content and delivery mode on the quality of assessment items developed by teachers in a two-year assessment development project. Teacher characteristics were examined as potential moderating factors. Four delivery modes were employed in the project: synchronous online, asynchronous online, in-person workshop, and blended (a combination of online and in-person training). The quality of the assessment items developed by participating teachers was measured by three indicators: (1) item acceptance rate, (2) number of reviews per item (how many times an item was rejected before being accepted), and (3) the psychometric properties of the items (item difficulty and item discrimination) in the field-test data.

A teacher perception survey with quantitative and qualitative components was used to explore teachers' perceptions of the training across the four modes and the impact teachers expected their project participation to have on their classroom assessment practices.

Multilevel modeling and multiple regression were used to examine the quality of items developed by participants, while constant comparative analysis, a chi-square test, and ANOVA were employed to analyze participants’ responses to a participation survey.

No pre-existing teacher variables were found to have a significant impact on item discrimination values, though prior assessment development experience beyond the classroom level had a significant relationship with the number of reviews per item. After controlling for prior assessment development experience, participant role had a significant (p < .01) effect on the number of reviews per item: items written by participants who served as both item writers and reviewers received significantly fewer reviews per item, meaning their items were rejected less often than items written by participants who served as item writers only. No differences in item quality were found based on the mode of training in which item writers participated.

Responses to the training evaluation survey differed significantly by mode of training (p < .001). The in-person group gave the lowest overall rating, followed by the asynchronous online group, while the synchronous online group gave the highest overall rating of the training. Participant responses to the open-ended questions also differed significantly by mode of training.