This study applied multi-faceted Rasch measurement to examine rater bias in the assessment of essays written by college students learning English as a foreign language. Four raters, each trained in a different academic discipline, applied a six-category rating scale to analytically rate essays on an argumentative topic and on a descriptive topic. FACETS, a Rasch computer program, was used to identify bias patterns by analyzing the rater-topic, rater-category, and topic-category interactions. Results showed that argumentative essays were rated more severely than descriptive essays; that the linguistics-major rater was the most lenient, while the literature-major rater was the most severe; and that the category of language use received the severest ratings, whereas content received the most lenient ratings. The severity hierarchies for raters, essay topics, and rating categories suggested that raters' academic training and their perceptions of the importance of categories were associated with their bias patterns. Implications for rater training are discussed.