12th January, 2021
David Hume defined a cause “[…] to be an object, followed by another, and where all the objects similar to the first, are followed by objects similar to the second.” (Hume, 1748, section VII)
\[ X \perp_{\mathcal{G}} Y \mid Z \Rightarrow X \perp_{\mathcal{P}} Y \mid Z \enspace . \]
\[ p(X_1, \ldots, X_n) = \prod_{i=1}^n p(X_i \mid \text{pa}^{\mathcal{G}}(X_i)) \enspace , \]
\[ \begin{aligned} p(Y = y \mid do(X = x)) &= p_m(Y = y \mid X = x) \\[0.50em] &= \sum_{z} p_m(Y = y, Z = z \mid X = x) \hspace{5.25em} \text{... Sum rule} \\[0.50em] &= \sum_{z} p_m(Y = y \mid X = x, Z = z) \, p_m(Z = z) \hspace{1.45em} \text{... Product rule} \\[0.50em] &= \sum_{z} p(Y = y \mid X = x, Z = z) \, p(Z = z) \hspace{2.5em} \text{... Assumptions} \end{aligned} \]
\[ p(Y = y \mid do(X = x)) = \sum_{z} p(Y = y \mid X = x, Z = z) \, p(Z = z) \]
\[ \begin{aligned} X &:= \epsilon_X \\[.5em] Y &:= X + \epsilon_Y \\[.5em] Z &:= Y + \epsilon_Z \enspace , \end{aligned} \]
\[ p(X_1, \ldots, X_n) = \prod_{i=1}^n p(X_i \mid \text{pa}^{\mathcal{G}}(X_i)) \hspace{6em} p(X, Y, Z) = p(Z \mid Y) \, p(Y \mid X) \, p(X) \]
set.seed(1) n <- 100 x <- rnorm(n, 0, 1) y <- x + rnorm(n, 0, 1) z <- y + rnorm(n, 0, 0.1)
\[ ACE(Z \rightarrow Y) = \mathbb{E}\left[Y \mid do(Z = z + 1) \right] - \mathbb{E}\left[Y \mid do(Z = z) \right] \enspace . \]
\[ ACE(Z \rightarrow Y) = \mathbb{E}[X + \epsilon_Y] - \mathbb{E}[X + \epsilon_Y] = 0 \enspace . \]
\[ \begin{aligned} ACE(X \rightarrow Y) &= \mathbb{E}[X + 1 + \epsilon_Y] - \mathbb{E}[X + \epsilon_Y] \\[0.50em] &= 1 + \mathbb{E}[X + \epsilon_Y] - \mathbb{E}[X + \epsilon_Y] \\[0.50em] &= 1 \enspace . \end{aligned} \]
\[ X \perp_{\mathcal{G}} Y \mid Z \Rightarrow X \perp_{\mathcal{P}} Y \mid Z \]
\[ X \perp_{\mathcal{P}} Y \mid Z \Rightarrow X \perp_{\mathcal{G}} Y \mid Z \]
Algorithm works as follows:
library('pcalg') set.seed(1) n <- 1000 X <- rnorm(n) Z <- X + rnorm(n) Y <- Z + rnorm(n) suffStat <- list('C' = cor(cbind(X, Y, Z)), 'n' = n) fit <- pc(suffStat = suffStat, indepTest = gaussCItest, p = 3, alpha = 0.01) plot(fit, main = '')
library('pcalg') set.seed(1) n <- 1000 X <- rnorm(n) Y <- rnorm(n) Z <- X + Y + rnorm(n) suffStat <- list('C' = cor(cbind(X, Y, Z)), 'n' = n) fit <- pc(suffStat = suffStat, indepTest = gaussCItest, p = 3, alpha = 0.01) plot(fit, main = '')
In particular, we have that the conditional distribution \(p(Y \mid \text{pa}^{\mathcal{G}}(Y))\) is invariant across \(\mathcal{E}\)
For linear Gaussian SCMs
\[ Y^e := \mu + \sum_{i \, \in \, \text{pa}^{\mathcal{G}}(Y)} \beta_i^e \cdot X_i^e + \varepsilon_Y^e \hspace{3em} \varepsilon_Y^e \perp (X_i, \ldots X_{|\text{pa}^{\mathcal{G}}(Y)|}) \enspace , \]
\[ \begin{aligned} X^1 &:= \varepsilon_{X^1} \\[0.50em] Y^1 &:= X^1 + \varepsilon_{Y^1} \\[0.50em] Z^1 &:= Y^1 + \varepsilon_{Z^1} \\[0.50em] \end{aligned} \]
\[ \begin{aligned} X^2 &:= 2 + \varepsilon_{X^2}\\[0.50em] Y^2 &:= X^2 + \varepsilon_{Y^2} \\[0.50em] Z^2 &:= Y^1 + \varepsilon_{Z^2} \\[0.50em] \end{aligned} \]
library('InvariantCausalPrediction') set.seed(1); n <- 500 X <- c(rnorm(n), 2 + rnorm(n)) Y <- X + rnorm(n) Z <- Y + rnorm(n) X <- cbind(X, Z); colnames(X) <- c('X', 'Z'); indicator <- rep(1:2, each = n) ICP(X, Y, indicator, test = 'exact', showCompletion = FALSE)
## ## accepted set of variables 1 ## accepted set of variables 1,2
## ## Invariant Linear Causal Regression at level 0.01 (including multiplicity correction for the number of variables) ## Variable X shows a significant causal effect ## ## LOWER BOUND UPPER BOUND MAXIMIN EFFECT P-VALUE ## X 0.44 1.04 0.44 0.00048 *** ## Z 0.00 0.52 0.00 1.00000 ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
\[ \begin{aligned} X &:= \varepsilon_X \\ Y &:= f(X, \varepsilon_Y) \\ \end{aligned} \]
with \(X \perp \varepsilon_Y\) and \(\varepsilon_X \perp \varepsilon_Y\)
The problem is that there is an equivalent SCM of the form
\[ \begin{aligned} Y &:= \varepsilon_y \\ X &:= g(Y, \varepsilon_X) \\ \end{aligned} \]
library('pcalg') options(scipen = 99, digits = 4) url <- 'https://webdav.tuebingen.mpg.de/cause-effect/' dat <- read.table(paste0(url, 'pair0064.txt'), col.names = c('drinking_water_access', 'infant_mortality')) lingam(dat)$Bpruned
## [,1] [,2] ## [1,] 0.000 0 ## [2,] -1.817 0
library('mgcv') library('dHSIC') options(scipen = 99, digits = 4) url <- 'https://webdav.tuebingen.mpg.de/cause-effect/' dat <- read.table(paste0(url, 'pair0038.txt'), col.names = c('age', 'bmi')) m1 <- gam(bmi ~ s(age), data = dat) m2 <- gam(age ~ s(bmi), data = dat)
library('mgcv') library('dHSIC') options(scipen = 99, digits = 4) url <- 'https://webdav.tuebingen.mpg.de/cause-effect/' dat <- read.table(paste0(url, 'pair0038.txt'), col.names = c('age', 'bmi')) m1 <- gam(bmi ~ s(age), data = dat) m2 <- gam(age ~ s(bmi), data = dat) c( dhsic.test(resid(m1), dat$age, method = 'gamma')$p.value, # Age \perp error dhsic.test(resid(m2), dat$bmi, method = 'gamma')$p.value # BMI \perp error )
## [1] 0.12313742 0.00002698