Rater Consistency in Rating L2 Learners’ Writing Task

Nguyen Thi Quynh Yen


Abstract

Rater consistency plays a critical part in the rating procedure. Test scores will be unreliable if examiners are inconsistent in their rating and fail to agree with other raters on the interpretation of the rating scale, on severity and leniency, and so on. Despite the difficulty of maintaining a common standard, the writing paper is widely used in many kinds of language tests because it provides not only high motivation for writing but also an excellent backwash effect on teaching. For this reason, it is necessary to establish high consistency in the scores given by one rater (intra-rater reliability) and by different raters (inter-rater reliability). This article discusses rater consistency in essay evaluation as conducted by several randomly chosen raters in the Faculty of English, the University of Languages and International Studies, VNU; on that basis, some suggestions are made to improve the reliability of rating L2 learners' essay writing.

Keywords: Rater consistency, intra-rater reliability, inter-rater reliability, holistic scoring, analytical scoring.
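
The two reliability notions named in the abstract can be illustrated with a short calculation. The sketch below is not part of the article's method; it uses hypothetical band scores to show how inter-rater reliability (agreement between two raters marking the same essays) and intra-rater reliability (one rater's consistency when re-marking the same essays) might be estimated with a Pearson correlation.

```python
# Minimal sketch (not from the article): estimating inter- and intra-rater
# reliability for essay scores. All scores below are hypothetical; real data
# would come from the raters described in the study.

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x ** 0.5 * var_y ** 0.5)

# Hypothetical holistic scores for ten essays on a 10-point scale.
rater_a          = [7, 5, 8, 6, 9, 4, 7, 6, 8, 5]   # first rater
rater_b          = [6, 5, 7, 6, 8, 5, 7, 5, 8, 6]   # second rater, same essays
rater_a_rescored = [7, 6, 8, 6, 9, 4, 6, 6, 8, 5]   # first rater, second occasion

# Inter-rater reliability: agreement between different raters.
print("inter-rater r:", round(pearson_r(rater_a, rater_b), 3))

# Intra-rater reliability: consistency of one rater across occasions.
print("intra-rater r:", round(pearson_r(rater_a, rater_a_rescored), 3))
```

In practice, studies of this kind may also report agreement indices that correct for chance (e.g., Cohen's kappa) or use many-facet Rasch measurement to model rater severity; the correlation above is only the simplest illustration of the concepts.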

