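library(ggdag)

# Front-door DAG: Genotype (u) is a latent confounder of Smoking (x) and Cancer (y);
# Tar (m) mediates the effect of Smoking on Cancer.
example <- dagify(x ~ u,
                  m ~ x,
                  y ~ u + m,
                  labels = c("x" = "Smoking",
                             "y" = "Cancer",
                             "m" = "Tar",
                             "u" = "Genotype"),
                  latent = "u",
                  exposure = "x",
                  outcome = "y")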
Why read this?
Regret about our actions stems from a counterfactual question: what if I had acted differently? To answer such a question, we need a more elaborate language than the one we use for prediction or intervention questions. Why? Because we need to compare what happened with what would have happened had we acted differently. We need to compute the Effect of Treatment on the Treated (ETT).
To compute the ETT, we need to formulate a Structural Causal Model and leverage what stays invariant between the observed world and the hypothetical world: the unobserved background variables. Indeed, for a binary treatment $X$ and background variables $U$, the ETT is defined thus:
$$ETT = E[Y_1 - Y_0 \mid X = 1] = \sum_u \left[Y_1(u) - Y_0(u)\right] P(u \mid X = 1)$$
Of course, we don’t have access to the background variables. In this post, we will learn to answer two questions: when is the ETT identifiable? And if it is, can we give an estimator for such a counterfactual in terms of non-experimental data?
We will first study a binary treatment and answer both questions. Then, we will tackle the more general case of any treatment.
A motivating binary example
The following example is taken from the book Causal Inference in Statistics: A Primer, by Pearl, Glymour, and Jewell.
Imagine an average adolescent: Joe. He has smoked ever since he began high school. Should he regret his decision? That is, given that he smokes, has he significantly increased his chances of suffering from lung cancer compared to his chances had he never started in the first place?
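Therefore, what Joe cares about is the Effect of Treatment on the Treated, where $X = 1$ denotes smoking and $Y$ denotes lung cancer:
$$ETT = E[Y_1 - Y_0 \mid X = 1] = E[Y_1 \mid X = 1] - E[Y_0 \mid X = 1]$$
The challenge, thus, is to estimate the counterfactual expression $E[Y_0 \mid X = 1]$: Joe's expected outcome had he never smoked, given that he in fact smokes.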
Expressing ETT in terms of observational data and experimental data
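Our treatment is binary. Therefore, let’s begin by using the law of total probability to write
$$E[Y_0] = E[Y_0 \mid X = 1] P(X = 1) + E[Y_0 \mid X = 0] P(X = 0)$$
We will use the consistency axiom:
$$E[Y_x \mid X = x] = E[Y \mid X = x]$$
Therefore, we can re-write the above expression and solve for the counterfactual term thus:
$$E[Y_0 \mid X = 1] = \frac{E[Y_0] - E[Y \mid X = 0] P(X = 0)}{P(X = 1)}$$
In the above expression there’s only one term that cannot be computed using observational data, $E[Y_0] = E[Y \mid do(X = 0)]$, which calls for experimental data.
By plugging in this term and using consistency once more, $E[Y_1 \mid X = 1] = E[Y \mid X = 1]$, we can express our ETT with terms that can be computed with a mix of observational and experimental data:
$$ETT = E[Y \mid X = 1] - \frac{E[Y \mid do(X = 0)] - E[Y \mid X = 0] P(X = 0)}{P(X = 1)}$$
Therefore, if the treatment is binary, whenever the causal effect of $X$ on $Y$ is identifiable, the ETT is identifiable too.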
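As a minimal sketch of the binary-treatment result, the R function below computes the ETT as $E[Y \mid X = 1] - \left(E[Y \mid do(X = 0)] - E[Y \mid X = 0] P(X = 0)\right) / P(X = 1)$. The function name `ett_binary`, its argument names, and the numbers in the usage line are hypothetical and chosen only for illustration; they are not data from this post.

```r
# Sketch: ETT for a binary treatment from three observational quantities
# plus one interventional quantity, E[Y | do(X = 0)].
# All names and numbers here are illustrative assumptions.
ett_binary <- function(ey_given_x1, ey_given_x0, p_x1, ey_do_x0) {
  stopifnot(p_x1 > 0, p_x1 <= 1)
  p_x0 <- 1 - p_x1
  # Law of total probability plus consistency, solved for E[Y_0 | X = 1]
  ey0_given_x1 <- (ey_do_x0 - ey_given_x0 * p_x0) / p_x1
  # Consistency again: E[Y_1 | X = 1] = E[Y | X = 1]
  ey_given_x1 - ey0_given_x1
}

# Hypothetical inputs, for illustration only
ett_example <- ett_binary(ey_given_x1 = 0.30, ey_given_x0 = 0.10,
                          p_x1 = 0.40, ey_do_x0 = 0.15)
```

A quick sanity check on the design: when there is no confounding, so that $E[Y \mid do(X = 0)] = E[Y \mid X = 0]$, the expression collapses to the plain difference $E[Y \mid X = 1] - E[Y \mid X = 0]$.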
Going back to Joe
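Let’s go back to our motivating example. We can express the ETT using the above derivation:
$$ETT = E[Y \mid X = 1] - \frac{E[Y \mid do(X = 0)] - E[Y \mid X = 0] P(X = 0)}{P(X = 1)}$$
Thus, we can estimate the ETT with only observational data if we can estimate $E[Y \mid do(X = 0)]$ from observational data.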
Let’s say that our causal DAG for the effects of Smoking on Cancer is the following:
Therefore, we can use the front-door formula to estimate the causal effect of smoking:
$$P(y \mid do(x)) = \sum_m P(m \mid x) \sum_{x'} P(y \mid m, x') P(x')$$
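To make the front-door adjustment concrete, here is a hedged base-R sketch that evaluates $\sum_m P(m \mid x) \sum_{x'} P(y \mid m, x') P(x')$ on a discrete joint distribution. The helper `front_door`, the data-frame layout (columns `x`, `m`, `y`, `p`), and every probability below are assumptions made for illustration; in particular, the numbers are a hypothetical model with a latent confounder, not the data used in this post.

```r
# Front-door adjustment on a discrete joint distribution P(x, m, y).
# `joint` is a data frame with columns x, m, y and p (the joint probability).
front_door <- function(joint, x_do, y_val = 1) {
  p_x_do <- sum(joint$p[joint$x == x_do])
  total <- 0
  for (m_val in unique(joint$m)) {
    # P(m | x)
    p_m_given_x <- sum(joint$p[joint$x == x_do & joint$m == m_val]) / p_x_do
    # Sum over x' of P(y | m, x') P(x')
    inner <- 0
    for (x_prime in unique(joint$x)) {
      in_cell <- joint$x == x_prime & joint$m == m_val
      p_y_given_m_x <- sum(joint$p[in_cell & joint$y == y_val]) / sum(joint$p[in_cell])
      inner <- inner + p_y_given_m_x * sum(joint$p[joint$x == x_prime])
    }
    total <- total + p_m_given_x * inner
  }
  total
}

# Hypothetical model with a latent confounder U (illustrative numbers only).
bern <- function(p1, v) ifelse(v == 1, p1, 1 - p1)
p_u1 <- 0.3                                      # P(U = 1)
p_x1_given_u <- c("0" = 0.2, "1" = 0.8)          # P(X = 1 | U = u)
p_m1_given_x <- c("0" = 0.1, "1" = 0.9)          # P(M = 1 | X = x)
p_y1 <- function(m, u) 0.1 + 0.3 * m + 0.4 * u   # P(Y = 1 | M = m, U = u)

grid <- expand.grid(u = 0:1, x = 0:1, m = 0:1, y = 0:1)
grid$p <- bern(p_u1, grid$u) *
  bern(p_x1_given_u[as.character(grid$u)], grid$x) *
  bern(p_m1_given_x[as.character(grid$x)], grid$m) *
  bern(p_y1(grid$m, grid$u), grid$y)

# What an analyst would observe: U is summed out.
joint <- aggregate(p ~ x + m + y, data = grid, FUN = sum)

effect_smoke <- front_door(joint, x_do = 1)
effect_no_smoke <- front_door(joint, x_do = 0)
```

In this hypothetical model the true interventional probabilities, computed with access to $U$, are 0.49 for $do(X = 1)$ and 0.25 for $do(X = 0)$. The front-door estimate recovers them from the observed joint alone, which is the point of the formula.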
Suppose, then, that we collect the following data:
Then, using the front-door criterion, the causal effect
Finally, we can calculate the ETT for Joe:
Therefore, given that
The more general case
Let’s say that our treatment is discrete, but not binary. Is the Effect of Treatment on the Treated (ETT) identifiable? Shpitser and Pearl have given an answer to this question using c-components.
Identifiability using C-components
Remember, two variables are assigned to the same c-component iff they are connected by a bi-directed path. The c-components themselves induce a factorization of the joint probability distribution in terms of c-factors: each c-factor is the post-intervention distribution of the variables in its c-component under an intervention on all the other variables.
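As a worked illustration of that factorization, consider the front-door graph $X \to M \to Y$ with a bi-directed edge $X \leftrightarrow Y$. The $Q[\cdot]$ notation for c-factors is an assumption borrowed from Tian and Pearl's presentation, and the closed forms below rely on the topological order $X, M, Y$:

```latex
% General factorization into c-factors
P(v) = \prod_j Q[S_j], \qquad Q[S_j] = P\big(s_j \mid do(v \setminus s_j)\big)

% Front-door graph X -> M -> Y with X <-> Y: c-components {X, Y} and {M}
Q[\{M\}] = P(m \mid x), \qquad Q[\{X, Y\}] = P(x)\, P(y \mid x, m)

% Their product recovers the observed joint distribution
P(x, m, y) = Q[\{X, Y\}]\, Q[\{M\}] = P(x)\, P(m \mid x)\, P(y \mid x, m)
```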
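Just as before the causal effect was identified when there is no bi-directed path between $X$ and any of its children, the ETT is identifiable under that same condition.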
Indeed, whereas before (when we were trying to estimate the causal effect) we summed out
That is, the same test is a sufficient test for causal effects identifiability and both a necessary and sufficient test for ETT identifiability.
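The test can be sketched in a few lines of base R, assuming it reads "no bi-directed path connects $X$ to any of its children", as the figure subtitles in this post suggest. The edge-matrix representation and the helper names `c_component` and `no_bidirected_path_to_child` are hypothetical; the second graph ($s \to z \to x \to y$ with $s \leftrightarrow y$ and $x \leftrightarrow s$) is included only as a contrasting illustration.

```r
# Nodes reachable from `node` through bi-directed edges only (its c-component).
# `bidirected` is a two-column character matrix, one row per bi-directed edge.
c_component <- function(node, bidirected) {
  seen <- node
  frontier <- node
  while (length(frontier) > 0) {
    hits <- bidirected[, 1] %in% frontier | bidirected[, 2] %in% frontier
    neighbours <- setdiff(unique(c(bidirected[hits, ])), seen)
    seen <- c(seen, neighbours)
    frontier <- neighbours
  }
  seen
}

# TRUE when no child of `x` shares a c-component with `x`.
# `directed` is a two-column character matrix of (from, to) edges.
no_bidirected_path_to_child <- function(x, directed, bidirected) {
  children <- directed[directed[, 1] == x, 2]
  !any(children %in% c_component(x, bidirected))
}

# Smoking graph: x -> m -> y, with x <-> y
smoking_ok <- no_bidirected_path_to_child(
  "x",
  directed = rbind(c("x", "m"), c("m", "y")),
  bidirected = rbind(c("x", "y"))
)

# Contrasting graph: s -> z -> x -> y, with s <-> y and x <-> s
second_ok <- no_bidirected_path_to_child(
  "x",
  directed = rbind(c("s", "z"), c("z", "x"), c("x", "y")),
  bidirected = rbind(c("s", "y"), c("x", "s"))
)
```

Here `smoking_ok` is `TRUE` because the only child of `x` is `m`, which sits in its own c-component, while `second_ok` is `FALSE` because `x` reaches its child `y` through the bi-directed path `x <-> s <-> y`.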
Confirming our former result
Let’s take our former example of the causal model of Smoking on Cancer. This time, we will use bi-directed paths to show that there’s an unobserved confounder:
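library(ggdag)
library(dplyr)
library(ggplot2)

# Same smoking model, with the latent confounder drawn as a bi-directed edge x <-> y
example <- dagify(x ~~ y,
                  m ~ x,
                  y ~ m)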
tidy_dagitty(example, layout = "nicely", seed = 2) %>%
node_descendants("x") %>%
mutate(linetype = if_else(direction == "->", "solid", "dashed")) %>%
ggplot(aes(x = x, y = y, xend = xend, yend = yend, edge_linetype = linetype, color = descendant)) +
geom_dag_edges(aes(end_cap = ggraph::circle(10, "mm"))) +
geom_dag_point() +
geom_dag_text(col = "white") +
labs(title = "The ETT is identifiable!",
subtitle = "Because there's no bi-directed path between x and m")
Since
However, if it were not binary, we could derive an estimator for the ETT using the induced factorization by the c-components. First, we replace with
Whereas the other c-component is
Therefore, the conditional distribution on
Conditioning on
That is, we replace
Not identifiable
Now let’s work with an example where the causal effect is identifiable, yet the counterfactual query behind the ETT is not:
# s -> z -> x -> y, with bi-directed edges s <-> y and x <-> s
example_not <- dagify(s ~~ y,
                      x ~~ s,
                      x ~ z,
                      z ~ s,
                      y ~ x)
tidy_dagitty(example_not, layout = "nicely", seed = 2) %>%
mutate(linetype = if_else(direction == "->", "solid", "dashed")) %>%
ggplot(aes(x = x, y = y, xend = xend, yend = yend, edge_linetype = linetype)) +
geom_dag_edges(aes(end_cap = ggraph::circle(10, "mm"))) +
geom_dag_point() +
geom_dag_text(col = "white") +
labs(title = "The ETT is not identifiable",
subtitle = "X is connected by a bi-directed path with S")
Conclusions
We’ve seen how regret is logically defined in terms of the Effect of Treatment on the Treated (ETT). We’ve also seen under which conditions the ETT is identifiable and how to derive an estimator for it in terms of observational data.