Journal of the Korean Statistical Society
Researchers often want to know whether a change in an explanatory variable X causes a change in a response variable Y. In practice, there can be two causal paths from X to Y: the path through a mediating variable M (the indirect effect) and the path not through M (the direct effect). Parameter estimation and hypothesis testing can be performed with a regression-based mediation model. It is already known that randomization of X is not enough for unbiased estimation, and the bias due to an unobserved variable has been discussed in the literature but is often overlooked. In this article, we first review the challenge under a simple mediation model, and then we provide a formula for the exact bias due to an unobserved precursor variable W, a variable that potentially causes changes in X, M, and/or Y. We present simulation studies to demonstrate the impact of an unobserved precursor variable on hypothesis testing for the indirect effect and the direct effect. The simulation results show that the inflation of the type I error rate is serious, particularly in large-sample studies. To demonstrate the exact-bias formula numerically, we revisit a popular data set published in a journal of statistics education and quantify why the conclusion of the data analysis can differ before and after accounting for the precursor variable. The results serve as a reminder of the importance of a precursor variable in mediation analysis.
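The kind of bias described in the abstract can be illustrated with a minimal simulation sketch. It is not the authors' formula or their simulation design. It assumes a linear model in which X is randomized and a standard normal precursor W affects M and Y only, and all names and coefficient values (`a`, `b`, `c_direct`, `g_m`, `g_y`) are hypothetical choices made for illustration. The true effect of M on Y is set to zero, so any nonzero estimate of that coefficient when W is omitted is bias.

```python
import numpy as np


def simulate(n, a=0.5, b=0.0, c_direct=0.0, g_m=0.5, g_y=0.5, seed=0):
    """Draw (X, W, M, Y) from an assumed linear mediation model.

    X is randomized (fair coin), W is an unobserved precursor of M and Y.
    All coefficient values are illustrative assumptions, not from the paper.
    """
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n).astype(float)   # randomized treatment
    w = rng.normal(size=n)                         # precursor variable
    m = a * x + g_m * w + rng.normal(size=n)       # mediator model
    y = c_direct * x + b * m + g_y * w + rng.normal(size=n)  # outcome model
    return x, w, m, y


def ols(y, *cols):
    """Least-squares coefficients for y on an intercept plus the given columns."""
    design = np.column_stack([np.ones_like(y), *cols])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


x, w, m, y = simulate(200_000)

# Outcome regression without W: coefficients are (intercept, X, M).
naive = ols(y, x, m)
# Outcome regression with W included: (intercept, X, M, W).
adjusted = ols(y, x, m, w)

print("M coefficient, W omitted :", round(naive[2], 3))
print("M coefficient, W included:", round(adjusted[2], 3))
print("X coefficient, W omitted :", round(naive[1], 3))
print("X coefficient, W included:", round(adjusted[1], 3))
```

Under these assumed values, with unit-variance errors, the M coefficient from the regression that omits W tends to g_m * g_y / (g_m^2 + 1) = 0.2 rather than 0, and the X coefficient tends to -0.1 rather than 0, even though X is randomized. Because these limits do not shrink as the sample grows, a test of a true null effect rejects more and more often in larger samples, which is consistent with the abstract's remark about large-sample studies.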
Kim, Steven B. and Lee, Joonghak, "Regression-based mediation analysis: a formula for the bias due to an unobserved precursor variable" (2021). Mathematics and Statistics Faculty Publications and Presentations. 18.