Writing assessments often require students to respond to multiple prompts, and the responses are judged by more than one rater. Several methods exist to establish the reliability of these assessments by disentangling the variation due to prompts and raters, including classical test theory, Many-Facet Rasch Measurement (MFRM), and Generalizability Theory (G-theory). Each of these methods defines a standard error of measurement (SEM), a single quantity that summarizes the overall variability of student scores. However, less attention has been given to conditional SEMs (CSEMs), which express the variability of the scores of individual students. This tutorial summarizes how to obtain CSEMs under each of the three methods, illustrates the concepts on real writing assessment data, and provides computational resources for CSEMs, including an example specification file for the FACETS program for MFRM and R code to compute CSEMs for G-theory.
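To make the distinction between an overall SEM and a conditional SEM concrete, the following is a minimal sketch in Python. It is not the R code from the tutorial. It assumes a simplified single-facet design in which each student receives several ratings, and it uses the familiar G-theory absolute-error CSEM for that design, namely the standard error of the student's own mean rating. The function name `conditional_sem` and the ratings are hypothetical and serve only as illustration.

```python
import math


def conditional_sem(scores):
    """Absolute-error CSEM for one student in a single-facet design.

    Assumption: `scores` holds the ratings one student received across
    conditions (for example, prompt-by-rater combinations treated as a
    single facet). The CSEM is the standard error of that student's mean:
    sqrt( sum((x - mean)^2) / (n * (n - 1)) ).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("at least two ratings are needed")
    mean = sum(scores) / n
    sum_sq = sum((x - mean) ** 2 for x in scores)
    return math.sqrt(sum_sq / (n * (n - 1)))


# Hypothetical ratings, invented for illustration only.
ratings = {
    "student_a": [3, 3, 4, 3],  # ratings that largely agree
    "student_b": [1, 5, 2, 4],  # ratings that disagree widely
}

csem = {student: conditional_sem(r) for student, r in ratings.items()}
for student, value in csem.items():
    print(f"{student}: CSEM = {value:.3f}")
```

In this sketch the two students would share the same overall SEM, yet the student whose ratings disagree more receives the larger CSEM. A design that crosses prompts and raters as separate facets needs the fuller treatment described in the tutorial.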
Huebner, Alan and Skar, Gustaf B.
"Conditional Standard Error of Measurement: Classical Test Theory, Generalizability Theory and Many-Facet Rasch Measurement with Applications to Writing Assessment,"
Practical Assessment, Research, and Evaluation: Vol. 26, Article 14.
Available at: https://scholarworks.umass.edu/pare/vol26/iss1/14