Multi-faceted Rasch measurement and bias patterns in EFL writing performance assessment

Tung Hsien He, Wen Johnny Gou, Ya Chen Chien, Ia Shan Jenny Chen, Shanmao Frank Chang

Research output: Contribution to journal (Article)

4 Citations (Scopus)

Abstract

This study applied multi-faceted Rasch measurement to examine rater bias in the assessment of essays written by college students learning English as a foreign language. Four raters, each trained in a different academic discipline, applied a six-category rating scale to analytically rate essays on an argumentative topic and on a descriptive topic. FACETS, a Rasch computer program, was used to identify bias patterns by analyzing the rater-topic, rater-category, and topic-category interactions. Results showed that argumentative essays were rated more severely than descriptive essays; that the linguistics-major rater was the most lenient rater, while the literature-major rater was the most severe; and that the category of language use received the most severe ratings, whereas content received the most lenient ratings. The severity hierarchies for raters, essay topics, and rating categories suggested that raters' academic training and their perceptions of the importance of categories were associated with their bias patterns. Implications for rater training are discussed.
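The abstract does not give the model specification, so the following is only a minimal sketch of the general many-facet Rasch rating-scale form, in which the log-odds of adjacent score levels are the writer's ability minus rater severity, topic difficulty, category difficulty, and a step threshold. The function names, the number of score levels, the threshold values, and the sign convention of the `bias` interaction term are all illustrative assumptions, not values or conventions taken from the study or from FACETS.

```python
import math


def category_probabilities(ability, rater_severity, topic_difficulty,
                           category_difficulty, thresholds, bias=0.0):
    """Return P(score = k) for k = 0..len(thresholds) under a
    many-facet Rasch rating-scale model (all values in logits).

    `bias` is a hypothetical interaction term (e.g. rater-by-topic);
    a positive value is treated here as extra severity (assumption).
    """
    logit = (ability - rater_severity - topic_difficulty
             - category_difficulty - bias)
    # Log-numerator for score k is the sum over steps 1..k of (logit - tau_step);
    # score 0 has log-numerator 0.
    log_numerators = [0.0]
    for tau in thresholds:
        log_numerators.append(log_numerators[-1] + logit - tau)
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_numerators)
    weights = [math.exp(v - peak) for v in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(probabilities):
    """Expected score level given the category probabilities."""
    return sum(k * p for k, p in enumerate(probabilities))


if __name__ == "__main__":
    steps = [-1.0, 0.0, 1.0]  # illustrative thresholds: four score levels, 0..3
    lenient = category_probabilities(0.5, -0.4, 0.0, 0.0, steps)
    severe = category_probabilities(0.5, 0.4, 0.0, 0.0, steps)
    print(expected_score(lenient), expected_score(severe))
```

In this sketch, a more severe rater, a harder topic, or a harder rating category each lowers the expected score for the same writer, which is the sense in which the abstract orders raters, topics, and categories by severity.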

Original language: English
Pages (from-to): 469-485
Number of pages: 17
Journal: Psychological Reports
Volume: 112
Issue number: 2
DOI: 10.2466/03.11.PR0.112.2.469-485
Publication status: Published - 2013 Apr 1

Fingerprint

Teaching
Language
Linguistics
Software
Learning
Students

All Science Journal Classification (ASJC) codes

  • Psychology (all)

Cite this

He, Tung Hsien ; Gou, Wen Johnny ; Chien, Ya Chen ; Chen, Ia Shan Jenny ; Chang, Shanmao Frank. / Multi-faceted Rasch measurement and bias patterns in EFL writing performance assessment. In: Psychological Reports. 2013 ; Vol. 112, No. 2. pp. 469-485.
@article{ba3fc215a2714b6d9bdae102684051ed,
title = "Multi-faceted Rasch measurement and bias patterns in EFL writing performance assessment",
abstract = "This study applied multi-faceted Rasch measurement to examine rater bias in the assessment of essays written by college students learning English as a foreign language. Four raters who had received different academic training from four distinctive disciplines applied a six-category rating scale to analytically rate essays on an argumentative topic and on a descriptive topic. FACETS, a Rasch computer program, was utilized to pinpoint bias patterns by analyzing the rater-topic, rater-category, and topic-category interactions. Results showed: argumentative essays were rated more severely than were descriptive essays; the linguistics-major rater was the most lenient rater, while the literature-major rater was the severest one; and the category of language use received the severest ratings, whereas content was given the most lenient ratings. The severity hierarchies for raters, essay topics, and rating categories suggested that raters' academic training and their perceptions of the importance of categories were associated with their bias patterns. Implications for rater training are discussed.",
author = "He, {Tung Hsien} and Gou, {Wen Johnny} and Chien, {Ya Chen} and Chen, {Ia Shan Jenny} and Chang, {Shanmao Frank}",
year = "2013",
month = "4",
day = "1",
doi = "10.2466/03.11.PR0.112.2.469-485",
language = "English",
volume = "112",
pages = "469--485",
journal = "Psychological Reports",
issn = "0033-2941",
publisher = "Ammons Scientific Ltd",
number = "2",

}

Multi-faceted Rasch measurement and bias patterns in EFL writing performance assessment. / He, Tung Hsien; Gou, Wen Johnny; Chien, Ya Chen; Chen, Ia Shan Jenny; Chang, Shanmao Frank.

In: Psychological Reports, Vol. 112, No. 2, 01.04.2013, p. 469-485.

TY - JOUR

T1 - Multi-faceted Rasch measurement and bias patterns in EFL writing performance assessment

AU - He, Tung Hsien

AU - Gou, Wen Johnny

AU - Chien, Ya Chen

AU - Chen, Ia Shan Jenny

AU - Chang, Shanmao Frank

PY - 2013/4/1

Y1 - 2013/4/1

AB - This study applied multi-faceted Rasch measurement to examine rater bias in the assessment of essays written by college students learning English as a foreign language. Four raters who had received different academic training from four distinctive disciplines applied a six-category rating scale to analytically rate essays on an argumentative topic and on a descriptive topic. FACETS, a Rasch computer program, was utilized to pinpoint bias patterns by analyzing the rater-topic, rater-category, and topic-category interactions. Results showed: argumentative essays were rated more severely than were descriptive essays; the linguistics-major rater was the most lenient rater, while the literature-major rater was the severest one; and the category of language use received the severest ratings, whereas content was given the most lenient ratings. The severity hierarchies for raters, essay topics, and rating categories suggested that raters' academic training and their perceptions of the importance of categories were associated with their bias patterns. Implications for rater training are discussed.

UR - http://www.scopus.com/inward/record.url?scp=84878261951&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84878261951&partnerID=8YFLogxK

U2 - 10.2466/03.11.PR0.112.2.469-485

DO - 10.2466/03.11.PR0.112.2.469-485

M3 - Article

C2 - 23833876

AN - SCOPUS:84878261951

VL - 112

SP - 469

EP - 485

JO - Psychological Reports

JF - Psychological Reports

SN - 0033-2941

IS - 2

ER -