Examination of the Application of Item Response Theory to the Angoff Standard Setting Procedure

Establishing valid and reliable passing scores is a vital activity for any examination used to make classification decisions. Although there are many different approaches to setting passing scores, this thesis focuses specifically on the Angoff standard setting method. The Angoff method is a test-centered approach to estimating performance standards that is grounded in classical test theory. In the Angoff method, each judge estimates the proportion of minimally competent examinees who will answer each item correctly. These values are summed across items and averaged across judges to arrive at a recommended passing score. Unfortunately, research has shown that the Angoff method has a number of limitations that have the potential to undermine both the validity and the reliability of the resulting standard. Many of these limitations can be linked to the method's grounding in classical test theory. The purpose of this thesis is to determine whether the limitations of the Angoff method could be mitigated by a transition to an item response theory (IRT) framework. Item response theory is a modern measurement model for relating examinees' latent ability to their observed test performance. Theoretically, the transition to an IRT-based Angoff method could result in more accurate, stable, and efficient passing scores. The methodology was divided into three studies designed to assess the potential advantages of using an IRT-based Angoff method. Study one examined the effect of allowing judges to skip unfamiliar items during the rating process. The goal of this study was to detect whether passing scores are artificially biased by deficits in the content experts' item-level content knowledge. Study two explored the potential benefit of setting passing scores on an adaptively selected subset of test items. This study attempted to leverage IRT's score invariance property to estimate passing scores more efficiently. Finally, study three compared IRT-based standards to traditional Angoff standards using a simulation study. The goal of this study was to determine whether passing scores set using the IRT Angoff method had greater stability and accuracy than those set using the common True Score Angoff method.
Together, these three studies examined the potential advantages of an IRT-based approach to setting passing scores. The results indicate that the IRT Angoff method does not produce more reliable passing scores than the common Angoff method. The transition to the IRT-based approach, however, does effectively ameliorate two sources of systematic error in the common Angoff method. The first source of error arises from requiring that all judges rate all items, and the second is introduced when the passing score is converted from the test score scale to the scaled score scale. By eliminating these sources of error, the IRT-based method allows for accurate and unbiased estimation of the judges' true opinion of the ability of the minimally capable examinee. Although not all of the theoretical benefits of the IRT Angoff method could be demonstrated empirically, the results of this thesis are encouraging. The IRT Angoff method was shown to eliminate two sources of systematic error, resulting in more accurate passing scores. In addition, this thesis provides a strong foundation for a variety of studies with the potential to aid in the selection, training, and evaluation of content experts. Overall, the findings suggest that applying IRT to the Angoff standard setting method has the potential to offer significantly more valid passing scores.
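
As an illustration of the traditional Angoff computation described in the abstract (item ratings summed across items, then averaged across judges), a minimal Python sketch might look like the following. The function name `angoff_cut_score` and the ratings matrix are hypothetical and are not taken from the thesis; the sketch covers only the classical procedure and does not attempt to reproduce the IRT-based variant studied there.

```python
from statistics import mean


def angoff_cut_score(ratings):
    """Return a raw-score passing standard from Angoff ratings.

    ratings[j][i] is judge j's estimate of the proportion of minimally
    competent examinees who would answer item i correctly.
    (Hypothetical helper for illustration only.)
    """
    if not ratings or not ratings[0]:
        raise ValueError("ratings must contain at least one judge and one item")

    n_items = len(ratings[0])
    for row in ratings:
        # In the traditional method every judge rates every item.
        if len(row) != n_items:
            raise ValueError("every judge must rate every item")
        if any(p < 0.0 or p > 1.0 for p in row):
            raise ValueError("ratings are proportions and must lie in [0, 1]")

    # Sum each judge's ratings across items to get that judge's implied
    # passing score, then average those scores across judges.
    judge_totals = [sum(row) for row in ratings]
    return mean(judge_totals)


# Made-up ratings: three judges, four items.
ratings = [
    [0.50, 0.75, 0.25, 0.50],  # judge 1, implied passing score 2.0
    [0.75, 0.75, 0.50, 0.50],  # judge 2, implied passing score 2.5
    [0.50, 0.50, 0.50, 0.50],  # judge 3, implied passing score 2.0
]

cut_score = angoff_cut_score(ratings)
print(f"Recommended raw passing score: {cut_score:.2f} out of {len(ratings[0])}")
```

Because this version requires a complete ratings matrix, it reflects the constraint the abstract identifies as a source of systematic error: every judge must rate every item, including unfamiliar ones.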