A variety of differential item functioning (DIF) methods have been proposed and used to ensure that a test is fair to all test takers in a target population, for example when a test is translated into other languages. However, once a method flags an item as showing DIF, it is difficult to conclude that the grouping variable (e.g., test language) is responsible for the result, because many confounding variables may also contribute to it. The present study aims to (i) demonstrate the application of propensity score methods in psychometric research on DIF for day-to-day researchers, and (ii) describe conditional logistic regression for matched data in a DIF context. Propensity score methods can help make different populations or groups comparable with respect to participants’ pre-test differences, which can assist in examining the validity of a causal claim about DIF. Accessed 3,183 times on https://pareonline.net from December 20, 2016 to December 31, 2019.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.