November 13, 2025

Confounding

Outline

Midterm Grades
Recap:
- definition of confounding
- why confounding happens
- causal graphs
Recognizing Confounding
- which additional variables generate confounding?
- what is the direction of bias?

Midterms

Confounding

Example:

“BC’s face-mask mandate reduced COVID mortality in the province over the course of the pandemic.”

The Fundamental Problem of Causal Inference means we can’t observe BC in the counterfactual world without a face-mask mandate.

Example:

Karaivanov et al. (2020), economists at SFU, investigate:

Did indoor mask mandates reduce COVID cases, on average? If mask mandates reduce COVID cases, they could also reduce mortality.

Example:

They compare COVID cases in Ontario Public Health Units (PHU) with and without mask mandates

Not all Ontario PHUs imposed mask mandates at the same time
Can compare PHUs with and without mandates
Correlation of mask mandates and COVID cases across PHUs

Example:

They find that PHUs with mask mandates have slower COVID case growth than PHUs without mask mandates…

Is there any reason to doubt that this correlation is the true causal effect of mask mandates?

Confounding

Correlation suffers from two sources of error:

random error: we observe patterns in \(X\) (independent variable) and \(Y\) (dependent variable) by chance, when there is in fact no relationship.

confounding (bias): the systematic observed correlation between \(X\) and \(Y\) is not the true causal effect of \(X\) on \(Y\).

It’s summer of 2020… the correlation Karaivanov et al use effectively compares…

COVID caseload in PHUs in Toronto area (with mask mandates)…

… with COVID caseload in PHUs like North Bay (without a mask mandate)

Correlation means we “plug in” missing counterfactuals from other cases:

Case	\(\overbrace{\text{Caseload (Mandate)}}^{Y}\)	\(\overbrace{\text{Caseload (No Mandate)}}^{Y}\)	\(\overbrace{\text{Mandate}}^{X}\)
Toronto	\(\mathrm{Cases_{Toronto}(Mandate)}\)	\(\color{red}{\mathrm{Cases_{Toronto}(No \ Mandate)}}\)	Yes
	\(\Downarrow{=}\)	\(\Uparrow{=}\)
North Bay	\(\color{red}{\mathrm{Cases_{North \ Bay}(Mandate)}}\)	\(\mathrm{Cases_{North \ Bay}(No \ Mandate)}\)	No

If this equivalence \((=)\) is false, then there is confounding (bias).
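As a sketch of why the equivalence matters, the comparison we actually observe can be split into the causal effect for Toronto plus a remainder. The labels under the braces are added here for illustration; the remainder is zero only if North Bay's observed outcome really equals Toronto's missing counterfactual.

\[
\underbrace{\mathrm{Cases_{Toronto}(Mandate)} - \mathrm{Cases_{North \ Bay}(No \ Mandate)}}_{\text{observed comparison}}
= \underbrace{\mathrm{Cases_{Toronto}(Mandate)} - \color{red}{\mathrm{Cases_{Toronto}(No \ Mandate)}}}_{\text{causal effect in Toronto}}
+ \underbrace{\color{red}{\mathrm{Cases_{Toronto}(No \ Mandate)}} - \mathrm{Cases_{North \ Bay}(No \ Mandate)}}_{\text{bias}}
\]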

Why Confounding?

Confounding occurs because:

We use the outcome in case B (North Bay) to stand in for the counterfactual outcome of case A (Toronto). But the outcome in case B is not the same as the counterfactual outcome in case A.

There are two ways of thinking about why this produces a bias; the unifying idea is that cases A and B differ in ways other than \(X\). These other differences affect both \(X\) and \(Y\).

Confounding and Causal Graphs

Explanation 1: visual

Confounding occurs when there are other differences between cases (call them variables, e.g. \(W\), etc.) that causally affect \(X\) and \(Y\).

The easiest way to understand this is visually.

Causal Graphs

Causal graphs represent a model of the true causal relationships between variables.

the nodes or dots correspond to variables

can be labeled with generic names for independent/dependent variables (\(X\), \(Y\)) or meaningful names (e.g. “Trump Tweets”, “Hate Crimes”)

the arrows convey the direction of the flow of causality

\(X \rightarrow Y\) means that \(X\) causes changes in \(Y\)
\(X \leftarrow W\) means that \(W\) causes changes in \(X\)

Arrows alone do not indicate whether \(X\), e.g., increases or decreases \(Y\).

Causal Graphs

A model of true causal relationships, because:

we don’t know the true causal relationships for sure
graphs permit us to think through whether there is confounding, assuming a particular story about what causes what

Causal Graphs

For example

PHUs in Toronto (that had a mask mandate) may have more university-educated adults than PHUs in North Bay (no mask mandate).

University-educated residents might be more likely to work from home \(\xrightarrow{causes}\) mask mandate affects fewer people \(\xrightarrow{causes}\) more likely to mandate masks
University-educated residents \(\xrightarrow{causes}\) work from home \(\xrightarrow{causes}\) fewer social contacts \(\xrightarrow{causes}\) lower COVID cases

Causal Graphs

In a causal graph, there is confounding of correlation of \(X\) and \(Y\) if…

any variable \(W\) has a causal path (of any length) toward \(X\) and a causal path toward \(Y\) that does not pass through \(X\)
(equivalently) there is a backdoor path or non-causal path from \(X\) to \(Y\)
- a chain of two or more arrows that follows arrows backwards from \(X\), changes direction once, and follows arrows toward \(Y\): \(X \leftarrow W \leftarrow Z \rightarrow Y\)
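The rule above can be checked mechanically once a graph is written down. The following is a minimal sketch, not a full causal-inference tool: the graph is a plain dictionary, the function names `reachable` and `confounders` are made up for illustration, and the example graphs (including the direct arrow from "Mandate" to "COVID Cases") are assumptions about what the slides' figures show. It flags a third variable \(W\) when \(W\) has a causal path to \(X\) and a causal path to \(Y\) that avoids \(X\); it does not treat reverse causality (\(Y \to X\)) as a third variable.

```python
from typing import Dict, FrozenSet, List, Set

# A causal graph: each variable maps to the variables it directly causes.
Graph = Dict[str, List[str]]


def reachable(graph: Graph, start: str, avoid: FrozenSet[str] = frozenset()) -> Set[str]:
    """Variables reached from `start` by following arrows forward, never entering `avoid`."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for child in graph.get(node, []):
            if child not in seen and child not in avoid:
                seen.add(child)
                stack.append(child)
    return seen


def confounders(graph: Graph, x: str, y: str) -> List[str]:
    """Third variables W with a causal path to X and a causal path to Y that excludes X."""
    nodes = set(graph) | {c for children in graph.values() for c in children}
    found = []
    for w in sorted(nodes - {x, y}):
        causes_x = x in reachable(graph, w)
        causes_y_without_x = y in reachable(graph, w, avoid=frozenset({x}))
        if causes_x and causes_y_without_x:
            found.append(w)
    return found


# Hypothetical graph: W affects Y only through X, so there is no backdoor path.
only_through_x = {"Positives": ["Mandate"], "Mandate": ["COVID Cases"]}

# Hypothetical graph: W also affects Y along a path that skips X, so there is a backdoor path.
backdoor = {
    "Positives": ["Mandate", "Stay Home"],
    "Stay Home": ["COVID Cases"],
    "Mandate": ["COVID Cases"],
}

# Hypothetical graph: the apparent backdoor changes direction more than once (at C).
two_turns = {"W": ["X", "C"], "Z": ["C", "Y"], "X": ["Y"]}

print(confounders(only_through_x, "Mandate", "COVID Cases"))  # []
print(confounders(backdoor, "Mandate", "COVID Cases"))        # ['Positives']
print(confounders(two_turns, "X", "Y"))                       # []
```

Searching forward from each candidate \(W\) mirrors the first statement of the rule; the backdoor-path statement describes the same pattern read starting from \(X\).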

Confounding?

Why do backdoor paths produce confounding? Why is it systematic?

Confounding

Why does confounding happen?

Explanation 2:

Confounding happens when cases that experience different levels of \(X\) have systematically different (factual and counterfactual) potential outcomes of \(Y\).

this happens because some other factor \(W\) affects \(X\) and potential outcomes of \(Y\)

Explanation 2: Other differences, Different potential outcomes

Case	\(\overbrace{\text{Caseload (Yes)}}^{Y}\)	\(\overbrace{\text{Caseload (No)}}^{Y}\)	\(\overbrace{\text{Mandate}}^{X}\)	\(\overbrace{\text{Work from Home}}^{W}\)
Toronto	\(\text{Fewer Cases}\)	\(\color{red}{\text{Fewer Cases}}\)	Yes	More
	\(\Updownarrow{\neq}\)	\(\Updownarrow{\neq}\)
North Bay	\(\color{red}{\text{More Cases}}\)	\(\text{More Cases}\)	No	Less
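A small simulation can make the table concrete. This is a sketch under invented numbers, not data from Karaivanov et al.: the function name `naive_difference`, the probabilities 0.8 and 0.2, the baseline of 10 cases, and the effect sizes are all assumptions chosen for illustration. Work from home (\(W\)) makes a mandate (\(X\)) more likely and lowers caseload (\(Y\)), so units with and without mandates have systematically different potential outcomes.

```python
import random


def naive_difference(n: int = 100_000, true_effect: float = 0.0, seed: int = 0) -> float:
    """Mean caseload with a mandate minus mean caseload without, in simulated units."""
    rng = random.Random(seed)
    with_mandate, without_mandate = [], []
    for _ in range(n):
        # W: does this unit have a lot of work from home? (assumed 50/50)
        w = rng.random() < 0.5
        # W -> X: work-from-home units are assumed more likely to adopt a mandate
        x = rng.random() < (0.8 if w else 0.2)
        # W -> Y lowers caseload by 3; X -> Y shifts caseload by `true_effect`
        y = 10.0 - 3.0 * int(w) + true_effect * int(x) + rng.gauss(0.0, 1.0)
        (with_mandate if x else without_mandate).append(y)
    return sum(with_mandate) / len(with_mandate) - sum(without_mandate) / len(without_mandate)


# Even when the mandate does nothing (true_effect = 0), the comparison is negative.
print(round(naive_difference(true_effect=0.0), 2))
```

Under these assumed numbers the expected gap is \(-3 \times (0.8 - 0.2) = -1.8\) even though the mandate has no effect: mandate units would have had fewer cases either way.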

#NotAllVariables produce confounding

Confounding?

No. No “backdoor” path.

Antecedent Variables

antecedent variable: a variable that affects \(X\)

e.g. in this path, \(W \xrightarrow{} X \xrightarrow{} Y\), \(W\) is an antecedent variable.
antecedent variables (\(W\)) do not produce confounding if the only causal path from \(W\) to \(Y\) passes through \(X\).
antecedent variables do produce confounding if there is another causal path from \(W\) to \(Y\) that does NOT include \(X\).

Antecedent Variable: Confounding?

No. No “backdoor” path.

Antecedent Variable: Confounding?

Yes. Mandate \(\xleftarrow{}\) Positives \(\xrightarrow{}\) Stay Home \(\xrightarrow{}\) COVID Cases

Antecedent Variable: Confounding?

No. No “backdoor” path; apparent “backdoor” changes directions more than once.

Confounding?

No backdoor path. Mask mandate affects COVID through mask wearing.

Intervening Variables

intervening variable: a variable that affects \(Y\) and is affected by \(X\).

e.g. in this path, \(X \xrightarrow{} M \xrightarrow{} Y\), \(M\) is an intervening variable.
intervening variables (\(M\)) do not produce confounding because they are on the causal path from \(X\) to \(Y\). They do not produce a backdoor path.

Intervening Variable

No backdoor path. Mask mandate affects COVID through mask wearing, avoiding indoor spaces.

Reverse Causality

reverse causality describes the situation where the dependent variable \(Y\) actually causes the independent variable \(X\).

So while we use the correlation to describe the effect of \(X\) on \(Y\): \(X \to Y\), the correlation in fact is the result of the effect of \(Y\) on \(X\): \(Y \to X\).

This is a special case of bias or confounding. (We could also draw this as “third variable”)

	Third Variable?	Key Attribute	Confounding?
Antecedent Variables \((W)\)	Yes	\(W \to X\)	No, if the only causal path from \(W\) to \(Y\) contains \(X\); Yes, if a causal path from \(W\) to \(Y\) excludes \(X\)
Intervening Variables \((M)\)	Yes	\(X \to M \to Y\)	No
Reverse Causality	No	\(Y \to X\)	Yes

Direction of Bias

Example:

Gun violence in Miami Beach has led to a state of emergency and curfew during Spring Break.

“We haven’t been able to figure out how to stop spring break from coming,” Mr. Gelber said. “We don’t want spring break here, but they keep coming.”

Does reducing the number of guns reduce firearms violence (deaths)?

Correlation of gun ownership and firearms homicides for US states.

Correlation: Guns and Gun Deaths

Can we conclude that gun ownership increases firearms homicides? [hands]

Direction of Bias

Let’s say Miami Beach is considering imposing gun control policies to reduce gun deaths during Spring Break. To make their decision, they look at the correlation on the previous slide…

If that correlation suffered from confounding (bias), how might it affect the policy decision and its consequences…

if the confounding induced an upward bias (observed correlation between gun ownership and firearms deaths more positive than true causal relationship)?

if the confounding induced a downward bias (observed correlation between gun ownership and firearms deaths more negative than true causal relationship)?

Direction of Bias

Why do we care?

As with measurement bias, we need to apply the weak severity principle to judge whether bias would present a problem.

If bias is upward and correlation is positive, true causal effect direction is unknown. \(\to\) Unclear what policy to choose. (Fails weak severity — confounding may lead to positive correlation even if truth is negative causal effect)

If bias is downward and correlation is positive, true causal effect direction is positive. \(\to\) Definitely impose gun restrictions. (confounding makes it harder to find this effect: strong severity)

Direction of Bias

As with measurement bias:

1. If confounding induces bias in the opposite direction of the correlation \(\to\) the true causal effect is in the same direction as the correlation (no failure of weak severity)
2. If confounding induces bias in the same direction as the correlation \(\to\) the true causal effect may be zero or in the opposite direction of the correlation (possible failure of weak severity)

In situation \((2)\): our conclusions depend on the magnitude or size of the bias induced by confounding. Not a topic for this class.
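A worked line of arithmetic shows why the two situations differ. This is a sketch that assumes, for simplicity, that bias adds to the true effect:

\[
\text{observed correlation} = \text{true causal effect} + \text{bias}
\quad\Longrightarrow\quad
\text{true causal effect} = \text{observed correlation} - \text{bias}
\]

If the observed correlation is positive and the bias is negative, subtracting a negative number makes the true effect even more positive than what we observe. If both are positive, the true effect is positive, zero, or negative depending on whether the bias is smaller than, equal to, or larger than the observed correlation, which is why magnitude matters only in that situation.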

Direction of Bias

Upward or Downward bias?

Confounding: Direction of Bias

The product of the signs on the causal paths \(W \to X\) and \(W \to Y\) gives us the direction of the bias created by confounding.\(^*\)

	\(W \xrightarrow{+} X\)	\(W \xrightarrow{-} X\)
\(W \xrightarrow{+} Y\)	\(Correlation(X,Y)\) Biased (+)	\(Correlation(X,Y)\) Biased (-)
\(W \xrightarrow{-} Y\)	\(Correlation(X,Y)\) Biased (-)	\(Correlation(X,Y)\) Biased (+)

\(^*:\) this only works when there is a single backdoor path. With multiple backdoor paths, we would need more information.
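The sign rule in the table is just multiplication, which a short sketch can make explicit. The function names `path_sign` and `bias_direction` are made up for illustration, and the signs assigned to the university-education story are one reading of the earlier example, not something the study reports. A path with several arrows gets the product of the signs on its arrows, and, as the note says, this covers a single backdoor path only.

```python
from math import prod
from typing import Sequence


def path_sign(arrow_signs: Sequence[int]) -> int:
    """Sign of a causal path: the product of the +1 / -1 signs on its arrows."""
    if any(s not in (1, -1) for s in arrow_signs):
        raise ValueError("each arrow sign must be +1 or -1")
    return prod(arrow_signs)


def bias_direction(w_to_x: Sequence[int], w_to_y: Sequence[int]) -> str:
    """Direction of bias from ONE backdoor path, given the arrow signs on W->X and W->Y."""
    sign = path_sign(w_to_x) * path_sign(w_to_y)
    return "upward (+)" if sign > 0 else "downward (-)"


# The four cells of the table:
print(bias_direction([+1], [+1]))  # upward (+)
print(bias_direction([-1], [+1]))  # downward (-)
print(bias_direction([+1], [-1]))  # downward (-)
print(bias_direction([-1], [-1]))  # upward (+)

# Assumed signs for the university-education story:
# W -> work from home (+) -> people affected by a mandate (-) -> chance of a mandate (-)
w_to_mandate = [+1, -1, -1]
# W -> work from home (+) -> social contacts (-) -> COVID cases (+)
w_to_cases = [+1, -1, +1]
print(bias_direction(w_to_mandate, w_to_cases))  # downward (-)
```

Under those assumed signs the correlation between mandates and cases would be more negative than the true effect of mandates.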

Downward bias

Confounding: Direction of Bias

Conclusion

Solutions?

We now know what confounding is and how it arises…

Consider the case of gun-control laws and gun violence:

Can you think of any comparisons we can use that guarantee no other variable \(W\) affects \(X\) (gun ownership) and \(Y\) (gun violence)?

Solutions?

Causal graphs point us to possible solutions:

If something else (\(W\)) changes \(X\) and \(Y\), leading to confounding…

We can use comparisons that hold \(W\) constant.
- but do we know all relevant differences? (all causal paths?)
We can use comparisons that break the connection between \(W\) and \(X\).
- how do we break this connection?
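The first idea, holding \(W\) constant, can be sketched with simulated data. Everything here is invented for illustration: the helper names `simulate_units` and `difference`, the probabilities, and the assumed true effect of \(-1\). The sketch also assumes \(W\) is the only confounder and is fully observed, which is exactly what the question above asks us to doubt.

```python
import random
from statistics import mean
from typing import List, Tuple

Unit = Tuple[bool, bool, float]  # (W, X, Y)


def simulate_units(n: int = 100_000, true_effect: float = -1.0, seed: int = 1) -> List[Unit]:
    """Simulated units where W raises the chance of X and lowers Y (assumed numbers)."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        w = rng.random() < 0.5
        x = rng.random() < (0.8 if w else 0.2)                       # W -> X
        y = 10.0 - 3.0 * int(w) + true_effect * int(x) + rng.gauss(0.0, 1.0)  # W -> Y, X -> Y
        units.append((w, x, y))
    return units


def difference(units: List[Unit]) -> float:
    """Mean Y among units with X minus mean Y among units without X."""
    with_x = [y for _, x, y in units if x]
    without_x = [y for _, x, y in units if not x]
    return mean(with_x) - mean(without_x)


units = simulate_units()

# Comparing all units mixes the effect of X with the effect of W.
naive = difference(units)

# Comparing only units that share the same value of W holds W constant.
within_w = {w: difference([u for u in units if u[0] == w]) for w in (False, True)}

print("naive:", round(naive, 2))
print("within W:", {w: round(d, 2) for w, d in within_w.items()})
```

With these assumed numbers the naive comparison lands near \(-2.8\) in expectation, while each within-\(W\) comparison lands near the assumed true effect of \(-1\).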

Conclusion

Confounding

bias: observed correlation \(\neq\) true causal relationship
Why?
- cases with different values of \(X\) also differ in \(\text{factual}, \color{red}{\text{counterfactual}}\) values of \(Y\)
- when a “third” variable affects both \(X\) and \(Y\) \(\to\) confounding.
Causal graphs help us diagnose possible sources of confounding.
Causal graphs help us diagnose the expected direction of bias (does confounding \(\to\) evidence failing weak severity?)

Causal graphs also point to possible solutions for confounding