Consistency of Rating Accuracy and Rater Errors in the Judgment of Human Performance


Videotapes were developed depicting persons performing in two jobs. Fourteen expert judges then carefully assigned multidimensional effectiveness ratings to each performer. Interjudge agreement among the experts was high (median intraclass r = .97), providing indirect validity evidence for these performance judgments and justifying the use of mean expert ratings as criterion “true scores” of effectiveness. The true scores were then used as criteria against which to judge the differential accuracy (Cronbach, L. J. Processes affecting scores on “understanding of others” and “assumed similarity.” Psychological Bulletin, 1955, 52, 177–193) of each of the 146 college-student raters who viewed the tapes. Scores reflecting halo, leniency/severity, and restriction-of-range rating errors were also computed for each student rater. Analysis of the ratings showed that within-job consistency of rating accuracy is approximately as high as that found in comparable studies requiring the perception of personality or the postdiction of behavior or opinions. Across-job consistency of this “ability” to perceive performance accurately is moderate (intraclass r = .46), suggesting that the ability may be somewhat situation specific. All of the rating errors were consistent within job, with halo and restriction of range reasonably consistent across the two jobs. The tendency to be lenient or severe in ratings was not consistent across jobs.

Citation / Publisher Attribution

Organizational Behavior and Human Performance, v. 20, issue 2, pp. 238–252