# Causal Graphs

Causal graphs are fairly easy to create in Python. We just need to recall the definition of a *node* and an *edge*. A *node* is a point on the graph (in causal graphs, these represent data); an *edge* is a line that connects two nodes. In causal diagrams, edges are *directed* arrows with a *head* (where the arrow points to) and a *tail* (where the arrow points from).
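To make the vocabulary concrete, here is a minimal plain-Python sketch that stores a small two-edge graph as a list of (tail, head) pairs and derives each node's parents and children from it. The names `edges`, `parents`, and `children` are illustrative choices, not part of any library.

```python
# Each directed edge is written as a (tail, head) pair: the arrow runs tail -> head.
edges = [("x", "y"), ("u", "y")]

# Collect every node that appears at either end of an edge.
nodes = sorted({node for edge in edges for node in edge})

# parents[n] lists the tails of arrows pointing into n;
# children[n] lists the heads of arrows pointing out of n.
parents = {node: [] for node in nodes}
children = {node: [] for node in nodes}
for tail, head in edges:
    parents[head].append(tail)
    children[tail].append(head)

print(nodes)
print(parents)
print(children)
```

In this sketch `y` has two parents (`x` and `u`) and no children, which is exactly what two arrows with their heads at `y` mean.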

The `graphviz` module makes plotting graphs easy. We’ll create a `Digraph` (short for directed graph).

```
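import graphviz

# Build a directed graph named 'G'
g1 = graphviz.Digraph('G')
g1.node('u')        # declare a node explicitly
g1.edge('x', 'y')   # x -> y (nodes are created as needed)
g1.edge('u', 'y')   # u -> y
g1                  # display the graph in the notebook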
```

Rather than use the `graphviz` module directly, we can use the `causalgraphicalmodels` module, which provides a suite of tools that work on top of graphviz diagrams.

To install the module from within the notebook, we can run `pip` as a shell command by prefixing it with `!`.

```
!pip install causalgraphicalmodels --user
```

```
Requirement already satisfied: causalgraphicalmodels in /home/james/.local/lib/python3.8/site-packages (0.0.4)
Requirement already satisfied: graphviz in /mnt/software/anaconda3/lib/python3.8/site-packages (from causalgraphicalmodels) (0.17)
Requirement already satisfied: pandas in /mnt/software/anaconda3/lib/python3.8/site-packages (from causalgraphicalmodels) (1.2.2)
Requirement already satisfied: networkx in /mnt/software/anaconda3/lib/python3.8/site-packages (from causalgraphicalmodels) (2.5)
Requirement already satisfied: numpy in /mnt/software/anaconda3/lib/python3.8/site-packages (from causalgraphicalmodels) (1.19.2)
Requirement already satisfied: decorator>=4.3.0 in /mnt/software/anaconda3/lib/python3.8/site-packages (from networkx->causalgraphicalmodels) (4.4.2)
Requirement already satisfied: python-dateutil>=2.7.3 in /mnt/software/anaconda3/lib/python3.8/site-packages (from pandas->causalgraphicalmodels) (2.8.1)
Requirement already satisfied: pytz>=2017.3 in /mnt/software/anaconda3/lib/python3.8/site-packages (from pandas->causalgraphicalmodels) (2021.1)
Requirement already satisfied: six>=1.5 in /mnt/software/anaconda3/lib/python3.8/site-packages (from python-dateutil>=2.7.3->pandas->causalgraphicalmodels) (1.15.0)
```

The syntax for a `causalgraphicalmodels` graph is slightly different from that of the `graphviz` module. Here, we specify a list of nodes as well as a list of directed edges. It is standard to write each directed edge as a (tail, head) pair of nodes.

```
from causalgraphicalmodels import CausalGraphicalModel
g2 = CausalGraphicalModel(
    nodes=["x", "y", "u"],
    edges=[
        ("x", "y"),
        ("u", "y")
    ]
)
g2.draw()
```

The advantage of using this module is that it has causal analysis tools built in. We can check for backdoor paths from \(x\) to \(y\) with the `get_all_backdoor_paths()` method.

```
print(g2.get_all_backdoor_paths("x", "y"))
```

```
[]
```

An empty list indicates that there is no backdoor path! If the causal model is true, then we can estimate a causal effect of \(x\) on \(y\).

Let’s modify the above graph to include a fork \(x \leftarrow u \rightarrow y\).

```
g3 = CausalGraphicalModel(
    nodes=["x", "y", "u"],
    edges=[
        ("x", "y"),
        ("u", "y"),
        ("u", "x")
    ]
)
g3.draw()
```

The above graph has a backdoor path \(x \leftarrow u \rightarrow y\).

```
print(g3.get_all_backdoor_paths("x", "y"))
```

```
[['x', 'u', 'y']]
```
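To see what a backdoor check like this involves, here is a simplified plain-Python sketch. It treats a backdoor path as any path between \(x\) and \(y\) (ignoring arrow direction) whose first edge points *into* \(x\). The function `backdoor_paths` is a hypothetical helper written for illustration; it is not the library's implementation, and it does not test whether a path is blocked by a collider.

```python
def backdoor_paths(edges, x, y):
    """Return all simple paths from x to y whose first edge points into x.

    `edges` is a list of (tail, head) pairs. Illustrative helper only.
    """
    # Undirected neighbours of every node, so paths can run against the arrows.
    neighbours = {}
    for tail, head in edges:
        neighbours.setdefault(tail, set()).add(head)
        neighbours.setdefault(head, set()).add(tail)

    # A backdoor path must leave x through one of its parents.
    parents_of_x = {tail for tail, head in edges if head == x}

    paths = []

    def walk(path):
        node = path[-1]
        if node == y:
            paths.append(path)
            return
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in path:  # keep paths simple (no repeated nodes)
                walk(path + [nxt])

    for parent in sorted(parents_of_x):
        walk([x, parent])
    return paths


no_fork = [("x", "y"), ("u", "y")]
with_fork = [("x", "y"), ("u", "y"), ("u", "x")]

print(backdoor_paths(no_fork, "x", "y"))
print(backdoor_paths(with_fork, "x", "y"))
```

On these two edge lists the sketch agrees with the library output shown above: no path for the graph without the fork, and the single path through `u` once the edge \(u \rightarrow x\) is added.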

Let’s reverse the direction of the edge between \(x\) and \(u\) so that \(x\) now sits in the middle of the fork \(u \leftarrow x \rightarrow y\).

```
g4 = CausalGraphicalModel(
    nodes=["x", "y", "u"],
    edges=[
        ("x", "y"),
        ("u", "y"),
        ("x", "u")
    ]
)
g4.draw()
```

Here, \(x\) is still correlated with the unobserved data \(u\). But note that in this model \(x\) *causes* \(u\) rather than being influenced by it.

There is no backdoor path from \(x\) to \(y\).

```
print(g4.get_all_backdoor_paths("x", "y"))
```

```
[]
```

With \(u\) unobserved, the effect of \(x\) on \(y\) that we can estimate under this model (if it is true) is the *total* causal effect of \(x\) on \(y\). If \(u\) were observable, then we could separate the direct effect \(x \rightarrow y\) from the mediated effect \(x \rightarrow u \rightarrow y\).
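
To see why the total effect combines both paths, suppose (purely as an illustrative assumption, not something the graph itself implies) that the relationships are linear, with hypothetical coefficients \(a\), \(b\), \(c\) and independent noise terms \(\varepsilon_u\), \(\varepsilon_y\):

\[
\begin{aligned}
u &= a x + \varepsilon_u \\
y &= b x + c u + \varepsilon_y \\
  &= (b + ac)\, x + c\, \varepsilon_u + \varepsilon_y
\end{aligned}
\]

Under these assumptions, \(b\) is the direct effect of \(x\) on \(y\), \(ac\) is the effect that runs through \(u\), and \(b + ac\) is the total effect, which is the quantity we would be estimating when \(u\) is left out.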

We’ll review these graphs after learning how to perform regression.

**Concept check**: replicate the graph in the code cell below.

You should have edges \(x \rightarrow z, z \rightarrow y, u \rightarrow x, u \rightarrow y\).

```
import graphviz
g5 = graphviz.Digraph('G', format='svg')
g5.edge('x', 'z')
g5.edge('z', 'y')
g5.edge('u', 'y')
g5.edge('u', 'x')
g5
```