Omitted Variable Bias in Differential Item Functioning Assessment

Hsiu-Yi Chao, Chi-Chen Chen, Chung-Ping Cheng, Jyun-Hong Chen

Differential item functioning (DIF) assessment has been widely applied for decades in routine item analysis to ensure test fairness. However, few studies have investigated, or even noted, omitted variable bias (OVB) in DIF assessment. As a result, estimates of DIF effects may be biased, leading to inflated type I error rates and/or reduced power in DIF assessment. In practice, test practitioners may therefore wrongly identify inequality among grouping variables and revise flagged DIF items based on misleading information. To address these problems, this study examined two issues in detail. The first issue is the robustness to OVB of the original method (i.e., assessing DIF without considering confounding variables), which was examined by evaluating the impact of ignoring OVB in DIF assessment. The second issue is that the controlled method (i.e., including all grouping variables in the model) faces the so-called trade-off between bias and inefficiency when assessing DIF. To address this second issue, the backward scale purification (BSP) procedure was applied to the controlled method to improve the performance of DIF assessment. Accordingly, three interrelated studies were conducted. In Study 1, type I error rates of the original and controlled methods were investigated. The results indicated that the controlled method controlled type I error rates well under all conditions. In contrast, the original method failed to control type I error rates when confounding variables exhibited DIF and the correlation among grouping variables was high (i.e., greater than or equal to .2). In Study 2, type II error rates of the controlled method were investigated. In comparison to the true model, the results indicated that the type II error rates of the controlled method increased as the number of confounding variables increased and as the correlation among grouping variables increased. This result reflects the trade-off between bias and inefficiency that arises when additional variables are added to the model. In Study 3, BSP was applied to the controlled method to reduce type II error rates. The results indicated that BSP effectively controlled type I error rates while maintaining acceptable power. In summary, the controlled method with BSP appears promising for helping test practitioners deal with OVB in DIF assessment, thereby ensuring fairness and validity in testing practices.
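
The contrast between the original and controlled methods can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes a logistic-regression DIF framework (the abstract does not state which DIF method was used), uses the true ability `theta` as the matching variable for simplicity, and omits BSP entirely. All names (`simulate`, `fit_logistic`, `g1`, `g2`) and all numeric settings (sample size, a DIF effect of -1.0 on `g2`, the 0.8/0.2 overlap between groups) are hypothetical choices made for illustration.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def fit_logistic(X, y, iters=25):
    """Maximum-likelihood logistic regression fitted by Newton-Raphson."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    hess[a][c] += w * xi[a] * xi[c]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-8:
            break
    return beta


def simulate(n=4000, seed=1):
    """Simulate responses to one item (all settings are illustrative assumptions).

    g1: grouping variable under study; it has NO DIF effect on the item.
    g2: confounding grouping variable, correlated with g1, WITH a DIF effect.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        g1 = 1 if rng.random() < 0.5 else 0
        g2 = 1 if rng.random() < (0.8 if g1 else 0.2) else 0
        theta = rng.gauss(0.0, 1.0)
        eta = 1.0 * theta - 1.0 * g2  # only g2 shifts the item's log-odds
        y = 1 if rng.random() < 1.0 / (1.0 + math.exp(-eta)) else 0
        rows.append((theta, g1, g2, y))
    return rows


rows = simulate()
y = [r[3] for r in rows]

# "Original" method: test g1 for DIF while omitting the confounder g2.
b_original = fit_logistic([[1.0, r[0], r[1]] for r in rows], y)

# "Controlled" method: include both grouping variables in the model.
b_controlled = fit_logistic([[1.0, r[0], r[1], r[2]] for r in rows], y)

print("g1 effect, original method:   %.3f" % b_original[2])
print("g1 effect, controlled method: %.3f" % b_controlled[2])
print("g2 effect, controlled method: %.3f" % b_controlled[3])
```

Under these assumptions, the original method attributes part of the DIF effect of `g2` to `g1`, which is the mechanism behind the inflated type I error rates described above. The controlled method recovers a `g1` effect near zero, at the cost of a larger standard error because `g1` and `g2` are correlated.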