Equating tests under the Generalized Partial Credit Model

Nonny Swediati, University of Massachusetts Amherst


The efficacy of several procedures for equating tests scored under the Partial Credit item response theory model was investigated. A simulation study was conducted to investigate the effect of several factors on the accuracy of equating for tests calibrated using the Partial Credit Model. The factors manipulated were the number of anchor items, the difference in the ability distributions of the examinee groups that take alternate forms of a test, the sample size of the groups taking the tests, and the equating method. The data for this study were generated according to the Generalized Partial Credit Model. Test lengths of 5 and 20 items were studied. The number of items in the anchor test ranged from two to four for the five-item test and from two to eight for the twenty-item test. Two levels of sample size (500 and 1000) and two levels of ability distribution (equal and unequal) were studied. The equating methods studied were four variations of the Mean and Sigma method and the characteristic curve method. The results showed that the characteristic curve method was the most accurate equating method under all conditions studied. The second most effective method of equating was the Mean and Sigma method, which used all the step difficulty parameter estimates in the computation of the equating constants. In general, all equating methods produced reasonably accurate equating with long tests and with a large number of anchor items when there was no mean difference in ability between the two groups. When there was a large ability difference between the two groups of examinees taking the test, item parameters were estimated poorly, particularly in short tests, and this in turn affected the equating methods adversely.
The conclusion is that poor parameter estimation makes it difficult to equate tests which are administered to examinee groups that differ greatly in ability, especially when the tests are relatively short and when the number of anchor items is small.
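
As a rough illustration of the Mean and Sigma approach named above, the sketch below computes linear equating constants from the step difficulty estimates of the anchor items and rescales one item onto the base form's scale. It assumes the textbook form of the method (slope from the ratio of standard deviations, intercept from the means) and pools all anchor step difficulties, which resembles the variant described as using all the step difficulty parameter estimates. The abstract does not specify the four variations studied, so the function names, the pooling choice, and the parameter values here are hypothetical.

```python
from statistics import mean, pstdev


def mean_sigma_constants(steps_new, steps_base):
    """Return (A, B) such that theta_base = A * theta_new + B.

    steps_new / steps_base: pooled step difficulty estimates of the same
    anchor items, calibrated separately on the new and the base form.
    """
    A = pstdev(steps_base) / pstdev(steps_new)
    B = mean(steps_base) - A * mean(steps_new)
    return A, B


def rescale_item(slope, steps, A, B):
    """Place one item's slope and step difficulties on the base scale."""
    return slope / A, [A * b + B for b in steps]


# Hypothetical anchor estimates (not taken from the dissertation).
steps_new = [-1.0, 0.0, 1.0]
steps_base = [-1.5, 0.5, 2.5]

A, B = mean_sigma_constants(steps_new, steps_base)
new_slope, new_steps = rescale_item(1.2, [-0.5, 0.5], A, B)
```

Because the constants depend only on the anchor estimates, poorly estimated step difficulties in a short anchor test pass directly into A and B, which is consistent with the sensitivity reported in the abstract.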

Subject Area

Educational evaluation

Recommended Citation

Swediati, Nonny, "Equating tests under the Generalized Partial Credit Model" (1997). Doctoral Dissertations Available from Proquest. AAI9809405.