
Date of Award

5-2011

Access Type

Campus Access

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Degree Program

Research and Evaluation Methods

First Advisor

Ronald K. Hambleton

Second Advisor

Stephen G. Sireci

Third Advisor

Craig S. Wells

Subject Categories

Educational Assessment, Evaluation, and Research

Abstract

This computer simulation study was designed to investigate comprehensively how formative test designs can capitalize on the dimensional structure among the proficiencies measured by a test, item selection methods, and computerized adaptive testing to improve measurement precision and classification accuracy. Four variables were manipulated to investigate the effectiveness of multidimensional adaptive testing (MAT): the number of dimensions measured by the test, the magnitude of the correlations among the dimensions, the item selection method, and the test design. Outcome measures included recovery of known proficiency scores, bias in estimation, and accuracy of proficiency classifications.
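The study's simulation code is not part of this record; the sketch below is only an illustration of how such a design might be set up, assuming proficiencies drawn from a multivariate normal distribution with a common correlation and a single cut score for classification. The dimension count, correlation, noise level, and cut score used here are hypothetical placeholders, not the study's conditions.

```python
# Illustrative sketch (not the author's code): generate correlated
# multidimensional proficiencies and compute the three outcome measures
# named in the abstract: score recovery (RMSE), bias, and classification
# accuracy. All numeric settings are placeholder values.
import numpy as np

rng = np.random.default_rng(seed=0)
n_examinees, n_dims, rho, cut = 1000, 3, 0.50, 0.0

# True proficiencies from a multivariate normal with common correlation rho
cov = np.full((n_dims, n_dims), rho)
np.fill_diagonal(cov, 1.0)
theta_true = rng.multivariate_normal(np.zeros(n_dims), cov, size=n_examinees)

# Stand-in for the estimates a MAT simulation would return
theta_hat = theta_true + rng.normal(scale=0.3, size=theta_true.shape)

rmse = np.sqrt(np.mean((theta_hat - theta_true) ** 2, axis=0))  # recovery
bias = np.mean(theta_hat - theta_true, axis=0)                  # estimation bias
accuracy = np.mean((theta_hat >= cut) == (theta_true >= cut), axis=0)  # classification

print(rmse, bias, accuracy)
```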

Unlike in previous MAT research, the number of dimensions had no significant effect on the outcome measures. Higher correlations among the dimensions (e.g., .50 or .80) produced a moderate improvement in the outcome measures.

Four item selection methods (Bayesian, Fisher, optimal, and random) were applied to compare the measurement efficiency of adaptive and non-adaptive item selection; random item selection served as a baseline. The Bayesian item selection method produced the best results across conditions, and the Fisher item selection method the second best, although the gap between the adaptive item selection methods narrowed with longer tests and higher correlations among the dimensions. The optimal item selection method produced results comparable to the adaptive item selection methods when the focus was on the accuracy of decision making, which in many applications of diagnostic assessment is the most important criterion. The impact of increasing the fixed test length was apparent on all of the outcome measures.
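As a rough illustration of what adaptive item selection involves (not the dissertation's implementation), the sketch below selects items under a multidimensional 2PL model by maximizing the determinant of the accumulated Fisher information matrix (a common D-optimality rule) at a provisional proficiency estimate. The item parameters, pool size, and test length are invented placeholders, and in a full simulation the proficiency estimate would be updated after each simulated response.

```python
# Minimal sketch of Fisher-information (D-optimal) item selection for a
# multidimensional 2PL model. Item parameters are random placeholders.
import numpy as np

rng = np.random.default_rng(seed=1)
n_items, n_dims, test_length = 200, 3, 20
a = rng.uniform(0.5, 2.0, size=(n_items, n_dims))  # discrimination vectors
d = rng.normal(size=n_items)                        # intercepts

def item_information(theta_hat, a_i, d_i):
    """Fisher information matrix p(1-p) * a a^T for one M2PL item."""
    p = 1.0 / (1.0 + np.exp(-(a_i @ theta_hat + d_i)))
    return p * (1.0 - p) * np.outer(a_i, a_i)

theta_hat = np.zeros(n_dims)                 # provisional proficiency estimate
administered = []
info = np.eye(n_dims) * 1e-3                 # small prior to keep det nonzero
for _ in range(test_length):
    best_item, best_det = None, -np.inf
    for j in range(n_items):
        if j in administered:
            continue
        det = np.linalg.det(info + item_information(theta_hat, a[j], d[j]))
        if det > best_det:
            best_item, best_det = j, det
    administered.append(best_item)
    info += item_information(theta_hat, a[best_item], d[best_item])
    # A full simulation would score the simulated response here and
    # re-estimate theta_hat before selecting the next item.
```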

The results of the study suggest that the Bayesian item selection method can be quite useful when there are at least moderate correlations among the dimensions. Because these results were obtained using good estimates of the priors, a next step should be to investigate the impact of inaccurate prior information (e.g., priors that are too high, too low, or too tight) on the validity of the Bayesian approach. We note, too, the very good results obtained with optimal item selection when the focus was on the accuracy of proficiency classifications.

DOI

https://doi.org/10.7275/5683340
