POLI 110: Confounding

December 2, 2024

Correlation to Causation

Solutions to Confounding

Recap
Before/After Comparisons

Recap

Solutions to Confounding

Every way of using correlation as evidence for causality makes assumptions

The fundamental problem of causal inference (FPCI) cannot be solved without assumptions
Given the right assumptions, we can say confounding/bias is not a problem

Solution	How Bias Solved	Which Bias Removed	Assumes	Internal Validity	External Validity
Experiment	Randomization Breaks \(W \rightarrow X\) link	All confounding variables	1. \(X\) is random 2. Change only \(X\)	High	Low
Conditioning	Hold confounders constant	Only confounders conditioned on	1. Condition on all confounders 2. Low measurement error 3. Cases similar in \(W\)	Low	High

Conditioning

**Did Trump rallies increase hate crimes?**

Feinberg, Branton, and Martinez-Ebers compare hate crimes in counties with and without Trump rallies, conditioning on (holding constant):

percent Jewish, number of hate groups, crime rate, 2012 Republican vote share, percent university educated, region

But they left out population, which confounds Trump rallies and hate crimes.
It is difficult to find counties without rallies that are similar on many traits to counties with rallies

After conditioning on population (a confounder): no correlation.
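
A minimal simulation sketch of why leaving out population matters. Everything here is hypothetical: the binary `big` population indicator, the probabilities, and the crime levels are made-up values, not the Feinberg, Branton, and Martinez-Ebers data, and rallies are assumed to have zero effect.

```python
import random
import statistics


def simulate_counties(n=20000, seed=0):
    """Simulate counties where population (W) drives both rallies (X) and hate crimes (Y)."""
    rng = random.Random(seed)
    counties = []
    for _ in range(n):
        big = rng.random() < 0.3                        # W: large population (hypothetical)
        rally = rng.random() < (0.5 if big else 0.05)   # W -> X: rallies favor big counties
        crimes = rng.gauss(10.0 if big else 2.0, 1.0)   # W -> Y: rally itself has zero effect
        counties.append((big, rally, crimes))
    return counties


def diff_in_means(rows):
    """Mean hate crimes in rally counties minus mean in no-rally counties."""
    treated = [y for _, x, y in rows if x]
    control = [y for _, x, y in rows if not x]
    return statistics.mean(treated) - statistics.mean(control)


counties = simulate_counties()

# Naive comparison: ignores population entirely.
naive = diff_in_means(counties)

# Conditioning: compare rally vs. no-rally counties within each population group.
within = {
    big: diff_in_means([row for row in counties if row[0] == big])
    for big in (True, False)
}

print(f"naive difference: {naive:.2f}")
print(f"difference within big counties: {within[True]:.2f}")
print(f"difference within small counties: {within[False]:.2f}")
```

Under these assumptions the naive comparison shows a large gap, while the comparisons within large counties and within small counties are both close to zero.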

Limits of Conditioning:

**Did Trump rallies increase hate crimes?**

If we know confounding variables, can we find cases with and without rallies that are the same on many confounding variables?
If we don’t know or can’t measure confounding variables, may still be differences between places with and without rallies that produce confounding.

What if we compare counties before and after rallies?

Before and After

Example: Before and After

Taking the same data from Feinberg, Branton, and Martinez-Ebers…

we focus only on counties that ever had a Trump (Clinton) rally
compare the month after the rally to the month before the rally

Before and After

DISCUSS

If we compare counties to themselves before and after rallies…

Which confounding variables are held constant?
What are confounding variables that might NOT be addressed in this comparison?

What kinds of confounding variables are held constant in this before/after comparison?

Break

Design Based Solutions

Conditioning removes confounding by:

identifying possible confounding variables
measuring confounding variables
examining the relationship between \(X\) and \(Y\) for cases with similar values of the confounding variables \(W\).

Design Based Solutions

Design-based solutions remove confounding by:

selecting cases for comparison in order to eliminate many known/unknown as well as measurable/unmeasurable confounding variables.
the nature of the comparison holds constant classes of confounding variables, not specific confounding variables.
by a “class” we mean all confounding variables that share specific properties (e.g., unchanging over time)

Example: Before and After

Which of these possible confounders are held constant in a before-and-after comparison (month after vs month before rally)?

Example: Before and After

Confounding Solved?…

All confounding variables (affect whether a rally occurs; affect hate crimes) that are unchanging over time are held constant.

because held constant, cannot produce confounding
e.g., demographic features, political leaning, location/geography, long-term economic trends, 8chan white nationalists
any variable that does not change in the time period of the comparison (in this case, two months) held constant
does not matter if we can think of or even measure the confounders
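
A small sketch of this logic, under made-up assumptions: each county has an unchanging, unmeasured `trait` that raises both the chance of a rally and the level of hate crimes, and the rally has a hypothetical effect `TRUE_EFFECT`. None of these numbers come from the actual data.

```python
import random
import statistics

TRUE_EFFECT = 0.5  # hypothetical effect of a rally on monthly hate crimes


def simulate(n=20000, seed=1):
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        trait = rng.gauss(0.0, 2.0)                         # unchanging, unmeasured W
        rally = rng.random() < (0.6 if trait > 0 else 0.1)  # W -> X
        before = trait + rng.gauss(0.0, 1.0)                # month before: no rally yet
        after = trait + (TRUE_EFFECT if rally else 0.0) + rng.gauss(0.0, 1.0)
        rows.append((rally, before, after))
    return rows


rows = simulate()

# Comparing different counties in the "after" month: the trait is NOT held constant.
cross_section = (
    statistics.mean(after for rally, _, after in rows if rally)
    - statistics.mean(after for rally, _, after in rows if not rally)
)

# Comparing rally counties to themselves: the trait cancels in after - before.
before_after = statistics.mean(after - before for rally, before, after in rows if rally)

print(f"cross-county comparison: {cross_section:.2f}")
print(f"before/after comparison: {before_after:.2f}")
```

The code never uses `trait` when computing the before/after estimate, which is the point: the comparison holds it constant without measuring it.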

Design Based Solutions

Before and after comparisons are design based, because…

they hold constant ALL unchanging confounding variables (a class of confounding variables).
in contrast, conditioning only blocks specific variables/paths

And like all solutions to confounding: they make an assumption

Design Based Solutions

Just like experiments and conditioning, before and after comparisons plug in for MISSING potential outcomes.

County	Time	\(Y:\) HC(Yes)	\(Y:\) HC(No)	\(X:\) Rally
\(c\)	Before	\(\color{red}{\text{Hate Crimes}_{c,Before}[\text{Rally}]}\)	\(\color{black}{\text{Hate Crimes}_{c,Before}[\text{No Rally}]}\)	No
			\(\Downarrow\)
\(c\)	After	\(\color{black}{\text{Hate Crimes}_{c,After}[\text{Rally}]}\)	\(\color{red}{\text{Hate Crimes}_{c,After}[\text{No Rally}]}\)	Yes

Design Based Solutions

County	Time	\(Y:\) HC(Yes)	\(Y:\) HC(No)	\(X:\) Rally
\(c\)	Before	\(\color{red}{\text{Hate Crimes}_{c,Before}[\text{Rally}]}\)	\(\color{black}{\text{Hate Crimes}_{c,Before}[\text{No Rally}]}\)	No
			\(\Downarrow\)
\(c\)	After	\(\color{black}{\text{Hate Crimes}_{c,After}[\text{Rally}]}\)	\(\boxed{\color{black}{\text{Hate Crimes}_{c,Before}[\text{No Rally}]}}\)	Yes

We assume \(\color{red}{\text{Hate Crimes}_{c,After}[\text{No Rally}]} = \color{black}{\text{Hate Crimes}_{c,Before}[\text{No Rally}]}\)

That is: if \(X\) had not changed, \(Y\) would not have changed.
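
One way to see what this assumption buys us is to split the observed before/after difference into two pieces. This is only an algebraic identity (add and subtract the missing potential outcome), written in the notation of the table above:

```latex
\[
\begin{aligned}
\underbrace{\text{Hate Crimes}_{c,After}[\text{Rally}] - \text{Hate Crimes}_{c,Before}[\text{No Rally}]}_{\text{observed before/after difference}}
 &= \underbrace{\text{Hate Crimes}_{c,After}[\text{Rally}] - \text{Hate Crimes}_{c,After}[\text{No Rally}]}_{\text{causal effect of the rally in county } c} \\
 &\quad + \underbrace{\text{Hate Crimes}_{c,After}[\text{No Rally}] - \text{Hate Crimes}_{c,Before}[\text{No Rally}]}_{\text{change that would have happened anyway (bias)}}
\end{aligned}
\]
```

If the assumption holds, the second term is zero and the observed difference equals the causal effect. If it fails, the second term is the bias.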

Before and After

Assumptions:

assume that the counterfactual potential outcome of \(Y\) without \(X\), after \(X\) happens, is the same as the factual \(Y\) without \(X\), before \(X\) happens
equivalently: assume there are no variables \(W\) that affect \(Y\) and change over time with \(X\).

Any \(W\) that affects \(Y\) and changes with \(X\) will produce confounding even if it does not cause \(X\).

this is a new problem
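
A hedged sketch of this new problem, with invented numbers: some \(W\) shifts during the same months as the rally and affects hate crimes, but it plays no role in whether a rally happens. The rally effect is assumed to be zero, and `SHOCK_EFFECT` is a hypothetical name for the effect of the change in \(W\).

```python
import random
import statistics

TRUE_EFFECT = 0.0   # assumption: rallies have no effect in this simulation
SHOCK_EFFECT = 1.0  # hypothetical effect on Y of a W that changes in the rally month


def before_after_estimate(n=5000, shock_with_rally=True, seed=2):
    """Average after - before change in hate crimes across rally counties."""
    rng = random.Random(seed)
    changes = []
    for _ in range(n):
        trait = rng.gauss(0.0, 2.0)           # unchanging confounder: cancels out
        before = trait + rng.gauss(0.0, 1.0)
        # W shifts over the same months as the rally; it does not cause the rally.
        w_change = SHOCK_EFFECT if shock_with_rally else 0.0
        after = trait + TRUE_EFFECT + w_change + rng.gauss(0.0, 1.0)
        changes.append(after - before)
    return statistics.mean(changes)


biased = before_after_estimate(shock_with_rally=True)
clean = before_after_estimate(shock_with_rally=False)

print(f"estimate when W changes with X: {biased:.2f}")
print(f"estimate when nothing else changes: {clean:.2f}")
```

The unchanging trait still cancels, but the before/after estimate picks up the full effect of the change in \(W\) even though \(W\) never influenced where rallies were held.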

Example: Before and After

When does this assumption fail?

Did Trump rallies take place in places that are already trending toward having more hate crimes?
Is it possible that Trump wanted to avoid controversy and waited to hold rallies in places until they had a month with a lower-than-usual number of hate crimes? (board)

We can address these concerns by looking at longer-term trends…

Mostly constant upward trend; no change when rallies occur
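
A rough sketch of what looking at longer-term trends can do, assuming a made-up constant upward trend (`SLOPE`) and no rally effect. The idea is to compare the change across the rally to the typical month-to-month change before it. The series and all numbers are simulated, not the actual hate crime data.

```python
import random
import statistics

SLOPE = 0.3        # hypothetical upward trend in hate crimes per month
TRUE_EFFECT = 0.0  # assumption: the rally changes nothing


def county_series(rng, months=12):
    """Monthly hate crimes for one county: fixed level + steady trend + noise."""
    level = rng.gauss(5.0, 2.0)
    return [level + SLOPE * t + rng.gauss(0.0, 0.5) for t in range(months)]


def summarize(n=4000, rally_month=10, seed=3):
    """The rally happens between month rally_month - 1 and month rally_month."""
    rng = random.Random(seed)
    naive, typical = [], []
    for _ in range(n):
        y = county_series(rng)
        y[rally_month] += TRUE_EFFECT
        # Month after the rally minus month before the rally.
        naive.append(y[rally_month] - y[rally_month - 1])
        # Month-to-month changes in the earlier months, all before the rally.
        earlier = [y[t] - y[t - 1] for t in range(1, rally_month)]
        typical.append(statistics.mean(earlier))
    return statistics.mean(naive), statistics.mean(typical)


naive_change, typical_change = summarize()

print(f"change across the rally: {naive_change:.2f}")
print(f"typical earlier monthly change: {typical_change:.2f}")
print(f"difference: {naive_change - typical_change:.2f}")
```

Here the change across the rally looks like an increase on its own, but it matches the usual monthly change, which is the pattern described as a constant upward trend with no change when rallies occur.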

Example: Before and After

When does this assumption fail?

In an over-time comparison, confounding can arise from variables that do not cause \(X\) to change, if they also change with \(X\) over time…

Does the rally change the measurement of hate crimes, but not the actual number of hate crimes? (Measurement bias)
- We can examine this concern by measuring hate crimes in other ways.
Are there other changes over the same time-frame (changes at the same time as \(X\), rallies)?
- This is harder to solve. We are in the same situation as with conditioning.
- Less of a problem when comparing across very short time periods (fewer variables change)

Example: Before and After

It may be that the effects are due to changes in measurement: Anti-Defamation League and FBI hate crime data give different results.

Another Example:

Why have real wages stayed stagnant?

Another Example:

Starting in the 1980s, automation via robotics/software started to grow.

Another Example:

From “before” to “after” growth of automation, we see slowing or even reversal of growth in real wages:

automation \(\to\) inequality, worse job / health outcomes

Can we reasonably conclude the machines are to blame?

Design: Before and After

What is it?

Compare the same case to itself before and after change in \(X\)

How does it work?

Holds constant all unchanging attributes of the case.

any confounding variables that do not change over time cannot produce change in \(Y\) with change in \(X\)

Before and After: Assumptions

In order to infer that \(X\) causes \(Y\) when \(X\) and \(Y\) are correlated in a before/after comparison

Must Assume

There are no other variables that affect \(Y\) and change with \(X\) over time (may be a causal or non-causal link of \(W\) and \(X\))

Before and After: Limitations

This assumption can be violated if…

other variables that affect \(Y\) change with \(X\) over time.
Value of \(Y\) in cases has a long-term trend in one direction
\(X\) changes in response to extreme changes in \(Y\) (e.g. gun laws respond to uptick in gun crimes)
\(X\) changes measurement of \(Y\)
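
To illustrate the third limitation (\(X\) responding to extreme values of \(Y\)), here is a minimal sketch with invented numbers. A policy \(X\) with no effect at all is adopted only after an unusually high value of \(Y\). `THRESHOLD` and the distributions are assumptions made for the example.

```python
import random
import statistics

TRUE_EFFECT = 0.0  # assumption: the policy X has no effect on Y
THRESHOLD = 1.5    # hypothetical: X is adopted only after an unusually high Y


def simulate(n=50000, seed=4):
    rng = random.Random(seed)
    changes = []
    for _ in range(n):
        baseline = rng.gauss(10.0, 1.0)          # unchanging level of Y for this case
        before = baseline + rng.gauss(0.0, 1.0)  # Y in the period before X
        if before - baseline < THRESHOLD:
            continue                             # no spike, so X is not adopted
        # After adoption, Y returns to its usual level plus fresh noise.
        after = baseline + TRUE_EFFECT + rng.gauss(0.0, 1.0)
        changes.append(after - before)
    return statistics.mean(changes), len(changes)


estimate, n_adopted = simulate()

print(f"cases adopting X: {n_adopted}")
print(f"before/after estimate: {estimate:.2f}")
```

The before/after comparison shows a drop in \(Y\) even though `TRUE_EFFECT` is zero, because cases were selected at a spike and then drifted back toward their usual level.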

Solution	How Bias Solved	Which Bias Removed	Assumes	Internal Validity	External Validity
Experiment	Randomization Breaks \(W \rightarrow X\) link	All confounding variables	1. \(X\) is random 2. Change only \(X\)	Highest	Lowest
Conditioning	Hold confounders constant	Only variables conditioned on	1. Condition on all confounders 2. Low measurement error 3. Cases same in \(W\)	Lowest	Highest
Before and After	Hold confounders constant	variables unchanging over time	1. causes of \(Y\) do not change w/ \(X\)	Lower	Higher
