A comparison of two different methods for setting performance standards for a test with constructed-response items

DOI

https://doi.org/10.7275/bhb9-8t88

Abstract

The trustworthiness of performance standards influences the credibility of criterion-referenced large-scale testing. In this paper, two standard-setting methods are evaluated and compared, when applied to a test with polytomously scored constructed-response items. A version of the Angoff method is chosen as representative of the class of test-centred standard-setting procedures and the borderline-group method represents the class of examinee-centred procedures. The evaluation is based on procedural, internal and external evidence. The results indicate that both methods provide reasonable and trustworthy approaches to standard setting, but also confirm some of the potential problems with these methods.

Accessed 23,651 times on https://pareonline.net from September 15, 2008 to December 31, 2019.

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Recommended Citation

Näsström, Gunilla and Nyström, Peter (2019) "A comparison of two different methods for setting performance standards for a test with constructed-response items," Practical Assessment, Research, and Evaluation: Vol. 13, Article 9.
DOI: https://doi.org/10.7275/bhb9-8t88
Available at: https://scholarworks.umass.edu/pare/vol13/iss1/9
