When students receive the same score on a test, does that mean they know the same amount about the topic? The answer is more complex than it may first appear. This paper compares classical and modern test theories in terms of how they estimate student ability, highlighting crucial distinctions between the aims of Rasch Measurement and Item Response Theory (IRT). By modeling a second parameter (item discrimination) and allowing item characteristic curves to cross, IRT models incorporate more information into the estimate of person ability, but the measurement scale is no longer guaranteed to have the same meaning for all test takers. We explicate the distinctions between the approaches and, using a simulation in R (code provided), demonstrate that IRT ability estimates for the same individual can vary substantially depending on the particular sample of people taking a test, whereas Rasch person-ability estimates are sample-free and test-free under varying conditions. These points are particularly relevant to standards-based assessment and computer-adaptive testing, where the aim is to say precisely what individuals know and can do at each level of ability.
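The contrast drawn above between crossing and non-crossing item characteristic curves can be sketched numerically. The following Python fragment is a minimal illustration (not the authors' R simulation; the item parameters are invented for the example): under the Rasch model every item shares the same discrimination, so the ICC of an easier item lies above that of a harder item at every ability level, whereas under the 2PL model a highly discriminating item can overtake a weakly discriminating one, so the items' relative difficulty depends on where on the ability scale you look.

```python
import math

def icc(theta, b, a=1.0):
    """Item characteristic curve: probability of a correct response
    for ability theta, item difficulty b, item discrimination a.
    With a = 1 this is the Rasch model; a free 'a' gives the 2PL."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Ability grid from -4 to +4 in steps of 0.1 (illustrative range).
thetas = [t / 10.0 for t in range(-40, 41)]

# Rasch: equal discriminations, so the easier item (b = -1) always has
# a higher success probability than the harder item (b = +1) -- the
# curves never cross and the item ordering is the same for everyone.
rasch_never_cross = all(icc(t, -1.0) > icc(t, 1.0) for t in thetas)

# 2PL: a steep, harder item (a = 2, b = 0.5) crosses a flat, easier
# item (a = 0.5, b = -0.5); the sign of the difference flips, so which
# item is "harder" changes along the ability scale.
diffs = [icc(t, 0.5, a=2.0) - icc(t, -0.5, a=0.5) for t in thetas]
twopl_curves_cross = min(diffs) < 0 and max(diffs) > 0

print(rasch_never_cross, twopl_curves_cross)  # prints: True True
```

This ordering change is the crux of the abstract's warning: once ICCs may cross, a fixed raw-score interpretation no longer carries the same meaning for all test takers.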
Stemler, Steven E., and Adam Naples. "Rasch Measurement v. Item Response Theory: Knowing When to Cross the Line." Practical Assessment, Research, and Evaluation, Vol. 26, Article 11. Available at: https://scholarworks.umass.edu/pare/vol26/iss1/11