Doctoral Dissertations
Date of Award
8-2021
Degree Type
Dissertation
Degree Name
Doctor of Philosophy
Major
Educational Psychology and Research
Major Professor
Louis M. Rocconi
Committee Members
Louis M. Rocconi, Jennifer A. Morrow, R. Steve McCallum, William Nugent
Abstract
Construct validity is necessary to confirm that psychometric instruments measure their intended construct and to limit measurement error. Differential item functioning (DIF) analysis helps to identify bias and error within measurement instruments and to support construct validity. DIF detection procedures have been thoroughly scrutinized under several item response theory models (e.g., Bulut & Suh, 2017; Cohen et al., 1996; Suh & Cho, 2014; Elosua & Wells, 2013; Wang & Shih, 2010; Woods & Grimm, 2011), but no study has explored DIF detection procedures for data fitted to polytomous multidimensional item response theory models. This study aims to identify the optimal DIF procedures for data that fit the multidimensional graded response model.
Three statistical and psychometric DIF detection procedures were compared: the multidimensional IRT likelihood ratio (MIRT-LR) test, the multidimensional extension of the logistic discriminant function analysis (MLDFA) method, and the multidimensional multiple indicators, multiple causes interaction (MIMIC-interaction) model. Multidimensional graded response data were generated for twenty items with five response options through a Monte Carlo simulation that varied sample size, DIF type, percentage of DIF, correlations between latent traits, and latent mean differences between groups, to determine how these conditions affect the type I error and rejection rates of the three DIF detection methods.
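The data-generation step described above can be sketched as follows. This is a minimal illustration, not the dissertation's simulation code: the two-dimensional simple structure, the parameter values, the DIF magnitude, and the helper name simulate_mgrm are all assumptions introduced here. Only the twenty items, the five response options, and the manipulated factors (sample size, DIF type, percentage of DIF, trait correlation, latent mean difference) come from the abstract.

```python
import numpy as np

def simulate_mgrm(n, a, d, corr, mean, rng):
    """Draw graded responses (0..K-1) under a multidimensional graded response model.

    a    : (items, dims) discrimination (slope) matrix
    d    : (items, K-1) intercepts, decreasing across categories within each item
    corr : common correlation between latent traits (hypothetical simplification)
    mean : latent mean vector for the group being simulated
    """
    n_items, n_dims = a.shape
    cov = np.full((n_dims, n_dims), corr)
    np.fill_diagonal(cov, 1.0)
    theta = rng.multivariate_normal(mean, cov, size=n)        # (n, dims)
    z = theta @ a.T                                           # (n, items)
    # Cumulative probabilities P(Y >= k | theta) for k = 1..K-1
    cum = 1.0 / (1.0 + np.exp(-(z[:, :, None] + d[None, :, :])))
    # One uniform draw per person-item; the response is the number of
    # cumulative thresholds that the draw falls below.
    u = rng.uniform(size=(n, n_items, 1))
    return (u < cum).sum(axis=2)

rng = np.random.default_rng(2021)
n_items = 20                                                  # from the abstract

# Assumed simple structure: items 0-9 load on trait 1, items 10-19 on trait 2.
a = np.zeros((n_items, 2))
a[:10, 0] = rng.uniform(1.0, 2.0, 10)
a[10:, 1] = rng.uniform(1.0, 2.0, 10)
# Four intercepts per item give five response options (values are illustrative).
d = np.tile(np.array([1.5, 0.5, -0.5, -1.5]), (n_items, 1))

# Focal-group parameters: uniform DIF as an intercept shift on 10% of items.
a_focal, d_focal = a.copy(), d.copy()
dif_items = [0, 10]
d_focal[dif_items] -= 0.5                                     # assumed DIF size
# Nonuniform DIF would instead alter the slopes, e.g. a_focal[dif_items] *= 0.7

reference = simulate_mgrm(500, a, d, corr=0.3, mean=[0.0, 0.0], rng=rng)
focal = simulate_mgrm(500, a_focal, d_focal, corr=0.3, mean=[0.0, 0.0], rng=rng)
print(reference.shape, focal.shape)
```

In a full Monte Carlo study, a call like this would be repeated many times per cell of the design, with each DIF detection procedure applied to every replicated data set.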
Results indicated type I error rates were inflated (i.e., greater than .05) for all three DIF detection procedures. The MLDFA method produced the highest rejection rates, but also displayed type I error rates between .06 and .15. The MIRT-LR test showed poor ability to detect nonuniform DIF and greatly inflated type I error rates when latent mean differences were unbalanced. The MIMIC-interaction model exhibited the lowest type I error rates and adequate rejection rates, indicating strong statistical power under most conditions. General method selection recommendations are provided for psychometric and assessment professionals. Future research on DIF detection procedures for polytomous multidimensional item response theory models is also discussed.
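The two outcome measures reported above can be computed from replicated DIF tests roughly as in the sketch below. The function name, the array layout (replications by items), and the p-values are hypothetical and made up for illustration; they are not results from the study. The .05 threshold matches the nominal level mentioned in the abstract.

```python
import numpy as np

def error_and_rejection_rates(p_values, dif_mask, alpha=0.05):
    """Summarize DIF flags across Monte Carlo replications.

    p_values : (replications, items) p-values from any DIF test
    dif_mask : (items,) True where DIF was actually simulated
    Returns (type I error rate, rejection rate).
    """
    flagged = np.asarray(p_values) < alpha
    dif_mask = np.asarray(dif_mask, dtype=bool)
    # Type I error: share of DIF-free items that were flagged anyway.
    type1 = flagged[:, ~dif_mask].mean()
    # Rejection rate (empirical power): share of true DIF items flagged.
    rejection = flagged[:, dif_mask].mean()
    return type1, rejection

# Made-up p-values: 2 replications, 4 items, only the first item has DIF.
p = [[0.01, 0.20, 0.03, 0.60],
     [0.04, 0.70, 0.30, 0.02]]
type1, rejection = error_and_rejection_rates(p, [True, False, False, False])
print(type1, rejection)
```

A type I error rate well above the nominal alpha, as reported for all three procedures, means DIF-free items are flagged more often than the test's stated level allows.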
Recommended Citation
Walker, John, "Comparison of Differential Item Functioning Detection Procedures under the Multidimensional Graded Response Model Framework." PhD diss., University of Tennessee, 2021.
https://trace.tennessee.edu/utk_graddiss/6496