Biometrika Advance Access originally published online on February 4, 2008
Biometrika 2008 95(1):17-33; doi:10.1093/biomet/asm092
Distortion of effects caused by indirect confounding
Nanny Wermuth, Department of Mathematical Statistics, Chalmers/Göteborgs Universitet, Gothenburg, Sweden wermuth{at}math.chalmers.se
D. R. Cox, Nuffield College, Oxford OX1 1NF, U.K. david.cox{at}nuffield.ox.ac.uk
Received for publication 1 April 2006. Revision received 1 June 2007.
Undetected confounding may severely distort the effect of an explanatory variable on a response variable, as defined by a stepwise data-generating process. The best known type of distortion, which we call direct confounding, arises from an unobserved explanatory variable common to a response and its main explanatory variable of interest. It is relevant mainly for observational studies, since it is avoided by successful randomization. By contrast, indirect confounding, which we identify in this paper, is an issue also for intervention studies. For general stepwise-generating processes, we provide matrix and graphical criteria to decide which types of distortion may be present, when they are absent and how they are avoided. We then turn to linear systems without other types of distortion, but with indirect confounding. For such systems, the magnitude of distortion in a least-squares regression coefficient is derived and shown to be estimable, so that it becomes possible to recover the effect of the generating process from the distorted coefficient.
Key Words: Graphical Markov model; Identification; Independence graph; Linear least-squares regression; Parameter equivalence; Recursive regression graph; Structural equation model; Triangular system
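The direct confounding described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the three-variable linear system, the parameter names `a`, `b`, `c`, the unit error variances and the function names are all assumptions chosen for illustration. An unobserved variable U affects both the explanatory variable X and the response Y, and the population least-squares coefficient of Y on X alone is compared with the generating effect `b`. The distortion follows the classical formula for an omitted variable (Cochran, 1938). The sketch does not reproduce the indirect confounding that the paper identifies.

```python
import numpy as np


def implied_covariance(A, error_variances):
    """Covariance of Y implied by the linear system A Y = eps with
    independent errors (an assumed notation for a triangular system)."""
    A_inv = np.linalg.inv(A)
    return A_inv @ np.diag(error_variances) @ A_inv.T


def ls_coefficients(cov, response, regressors):
    """Population least-squares coefficients of one variable on others."""
    S_xx = cov[np.ix_(regressors, regressors)]
    S_xy = cov[np.ix_(regressors, [response])]
    return np.linalg.solve(S_xx, S_xy).ravel()


def population_coefficients(a, b, c):
    """Hypothetical generating process, variables ordered (Y, X, U):
         Y = b*X + c*U + e_y,   X = a*U + e_x,   U = e_u,
    with all errors independent and of unit variance (an assumption).
    Returns the coefficient of X when U is omitted, and the pair of
    coefficients (for X and U) when U is included."""
    A = np.array([
        [1.0, -b, -c],
        [0.0, 1.0, -a],
        [0.0, 0.0, 1.0],
    ])
    cov = implied_covariance(A, [1.0, 1.0, 1.0])
    marginal = ls_coefficients(cov, response=0, regressors=[1])[0]
    conditional = ls_coefficients(cov, response=0, regressors=[1, 2])
    return marginal, conditional


if __name__ == "__main__":
    a, b, c = 0.6, 0.5, 0.8
    marginal, conditional = population_coefficients(a, b, c)
    # Omitted-variable formula: b + c * (coefficient of U regressed on X).
    predicted = b + c * a / (a**2 + 1.0)
    print("generating effect of X on Y:", b)
    print("coefficient of X with U omitted:", marginal)
    print("value predicted by the omitted-variable formula:", predicted)
    print("coefficients of (X, U) with U included:", conditional)
```

Setting `a = 0` mimics successful randomization of X in this toy system: X no longer depends on U, and the coefficient of Y on X alone equals `b`.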