November 27, 2024
We want to use correlation to provide evidence of causation:
Every way of using correlation as evidence for causality makes a trade-off between:
Solution | How Bias Solved | Which Bias Removed | Assumes | Internal Validity | External Validity |
---|---|---|---|---|---|
Experiment | Randomization breaks \(W \rightarrow X\) link | All confounding variables | \(X\) is random; change only \(X\) | High | Low |
Conditioning | Hold confounders constant | ? | ? | Low | High |
When we observe \(X\) and \(Y\) for multiple cases, we examine the correlation of \(X\) and \(Y\) within groups of cases that are the same on the confounding variables (\(W\), etc.).
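A minimal sketch of this idea, using synthetic data that is purely an assumption for illustration: a binary confounder `w` raises both `x` and `y`, while `x` has no effect on `y`. The helper names `pearson` and `within_group_correlations` are hypothetical.

```python
import random
from collections import defaultdict

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either variable has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def within_group_correlations(cases):
    """cases: iterable of (w, x, y). Returns {w: corr(x, y) among cases sharing w}."""
    groups = defaultdict(list)
    for w, x, y in cases:
        groups[w].append((x, y))
    return {w: pearson([x for x, _ in g], [y for _, y in g])
            for w, g in groups.items()}

# Synthetic data (assumption): W affects both X and Y; X does not affect Y.
rng = random.Random(0)
cases = []
for _ in range(4000):
    w = rng.choice([0, 1])
    x = 1 if rng.random() < (0.8 if w else 0.2) else 0
    y = 2.0 * w + rng.gauss(0, 1)
    cases.append((w, x, y))

overall = pearson([x for _, x, _ in cases], [y for _, _, y in cases])
by_group = within_group_correlations(cases)
print("correlation ignoring W:", round(overall, 2))
print("correlation within each value of W:",
      {w: round(r, 2) for w, r in by_group.items()})
```

In this setup the overall correlation is clearly positive, while the correlation within each group of cases sharing the same `w` is close to zero.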
How does conditioning solve the problem?
Feinberg, Branton, and Martinez-Ebers compare hate crimes in counties with and without Trump rallies, but condition on (hold constant):
County | HC(Yes) Y | HC(No) Y | Rally (X) | Jewish % | Hate Groups | Crime Rate | Rep. % | Univ. % | Region |
---|---|---|---|---|---|---|---|---|---|
a | \(More\) | \(\color{red}{Fewer}\) | Yes | 2 | 3 | 15 | 53 | 38 | South |
\(\Downarrow\) | \(\Uparrow\) | | | | | | | | |
b | \(\color{red}{More}\) | \(Fewer\) | No | 2 | 3 | 15 | 53 | 38 | South |
Feinberg, Branton, and Martinez-Ebers find that, even after conditioning, Trump rallies increase the risk of hate crimes by 200%!
Economics PhD Candidates show that conditioning on the same variables…
County | HC(Yes) Y | HC(No) Y | Rally (X) | Jewish % | Hate Groups | Crime Rate | Rep. % | Univ. % | Region | Pop. |
---|---|---|---|---|---|---|---|---|---|---|
a | \(More\) | \(\color{red}{More}\) | Yes | 2 | 3 | 15 | 53 | 38 | South | High |
\(\not\Downarrow\) | \(\not\Uparrow\) | | | | | | | | | |
b | \(\color{red}{Fewer}\) | \(Fewer\) | No | 2 | 3 | 15 | 53 | 38 | South | Low |
We would be wrong to use the observed hate crimes in county \(b\) (without a rally) as a substitute for the counterfactual hate crimes in county \(a\) (had it not had a rally).
Population differences \(\to\) difference in hate crimes regardless of rally.
After also conditioning on population (a confounder): no correlation.
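The role of an omitted confounder can be sketched with a small simulation. Everything here is an assumption for illustration, not the authors' data or method: two binary confounders named `hate_groups` and `high_pop`, a rally indicator `x` with no true effect on the outcome `y`, and a hypothetical helper `stratified_difference` that compares cases only within groups sharing the chosen confounder values.

```python
import random
from collections import defaultdict

def stratified_difference(rows, keys):
    """Case-weighted average of within-stratum mean(y | x=1) - mean(y | x=0).

    Strata that lack either x=1 or x=0 cases are skipped, since no
    comparison is possible there.
    """
    strata = defaultdict(lambda: {0: [], 1: []})
    for r in rows:
        strata[tuple(r[k] for k in keys)][r["x"]].append(r["y"])
    total, weighted = 0, 0.0
    for groups in strata.values():
        if groups[0] and groups[1]:
            n = len(groups[0]) + len(groups[1])
            diff = (sum(groups[1]) / len(groups[1])
                    - sum(groups[0]) / len(groups[0]))
            weighted += n * diff
            total += n
    return weighted / total

rng = random.Random(1)
rows = []
for _ in range(20000):
    hate_groups = rng.choice([0, 1])   # confounder conditioned on in both analyses
    high_pop = rng.choice([0, 1])      # confounder omitted in the first analysis
    p_rally = 0.1 + 0.2 * hate_groups + 0.5 * high_pop
    x = 1 if rng.random() < p_rally else 0
    # Outcome depends on both confounders, NOT on the rally (true effect is zero).
    y = 1.0 * hate_groups + 3.0 * high_pop + rng.gauss(0, 1)
    rows.append({"x": x, "y": y, "hate_groups": hate_groups, "high_pop": high_pop})

partial = stratified_difference(rows, ["hate_groups"])
full = stratified_difference(rows, ["hate_groups", "high_pop"])
print("conditioning on hate groups only:", round(partial, 2))
print("conditioning on hate groups and population:", round(full, 2))
```

Under these assumptions, conditioning on only one confounder still leaves a sizeable rally/no-rally difference, and the difference shrinks toward zero once population is also held constant.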
In order to use conditioning to infer \(X\) causes \(Y\) if \(X,Y\) correlated …
How can we tell whether this assumption is correct?
DISCUSS
In the wake of mass shootings, we might ask:
Do mass shooting events cause people to become more supportive of stricter gun control policies?
Newman and Hartman (2017) examine whether exposure to mass shootings causes an increase in support for stricter gun control.
Which variables do we NEED to condition on? Which variables do we NOT NEED to condition on?
Must Assume
But…
Compare proximity to mass shooting and gun control attitudes, conditioning on (holding constant):
Proximity to mass shootings increases support for stricter gun laws… assuming no other confounders
Must Assume
But…
In order to infer \(X\) causes \(Y\) if \(X,Y\) correlated after conditioning
Imagine we want to condition on gun ownership, when examining correlation of mass shootings and gun attitudes.
What if we measure gun ownership with random measurement error?
Let’s see what happens… BOARD
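One way to see what happens is a small simulation. This is a sketch under stated assumptions, not the board derivation: a binary confounder `owner` (standing in for gun ownership) affects both exposure `x` and attitude `y`, `x` has no true effect, and `measured` is `owner` misclassified at random 20% of the time. All names and numbers are hypothetical.

```python
import random

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def adjusted_difference(rows, key):
    """Case-weighted average of mean(y | x=1) - mean(y | x=0) within levels of `key`."""
    total, weighted = 0, 0.0
    for level in {r[key] for r in rows}:
        stratum = [r for r in rows if r[key] == level]
        exposed = [r["y"] for r in stratum if r["x"] == 1]
        unexposed = [r["y"] for r in stratum if r["x"] == 0]
        if exposed and unexposed:
            weighted += len(stratum) * (mean(exposed) - mean(unexposed))
            total += len(stratum)
    return weighted / total

rng = random.Random(2)
rows = []
for _ in range(20000):
    owner = rng.choice([0, 1])                          # true confounder
    x = 1 if rng.random() < (0.2 if owner else 0.6) else 0
    y = -2.0 * owner + rng.gauss(0, 1)                  # x has NO effect on y
    # Random measurement error: the recorded value is flipped 20% of the time.
    measured = owner if rng.random() < 0.8 else 1 - owner
    rows.append({"x": x, "y": y, "owner": owner, "measured": measured})

naive = (mean(r["y"] for r in rows if r["x"] == 1)
         - mean(r["y"] for r in rows if r["x"] == 0))
using_truth = adjusted_difference(rows, "owner")
using_noisy = adjusted_difference(rows, "measured")
print("no conditioning:", round(naive, 2))
print("conditioning on true ownership:", round(using_truth, 2))
print("conditioning on mismeasured ownership:", round(using_noisy, 2))
```

In this setup, conditioning on the mismeasured variable removes only part of the bias: cases that look the same on measured ownership still differ on true ownership.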
In order to infer \(X\) causes \(Y\) if \(X,Y\) correlated after conditioning
Let’s say we want to examine the effect of gun laws on gun violence across countries:
What are some factors that might affect strictness of gun regulation and gun violence?
Is there, e.g., a country that is similar to the US on all confounders?
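The question of whether similar cases exist can be checked mechanically. The sketch below uses made-up cases `A` to `D` and hypothetical confounders `wealth` and `federal`; it lists the cases for which no case with the opposite value of `x` shares the same confounder values.

```python
def unmatched_cases(cases, confounders):
    """Names of cases with no opposite-x case sharing all confounder values."""
    seen = {}
    for c in cases:
        key = tuple(c[k] for k in confounders)
        seen.setdefault(key, set()).add(c["x"])
    return [c["name"] for c in cases
            if len(seen[tuple(c[k] for k in confounders)]) < 2]

# Hypothetical cases: x = 1 means strict gun laws, x = 0 means lax gun laws.
cases = [
    {"name": "A", "x": 1, "wealth": "high", "federal": True},
    {"name": "B", "x": 0, "wealth": "high", "federal": True},
    {"name": "C", "x": 1, "wealth": "low", "federal": False},
    {"name": "D", "x": 0, "wealth": "high", "federal": False},
]

print(unmatched_cases(cases, ["wealth", "federal"]))  # ['C', 'D']
print(unmatched_cases(cases, ["wealth"]))             # ['C']
```

The more confounders we hold constant, the more cases are left without a comparable counterpart.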
In order to infer \(X\) causes \(Y\) if \(X,Y\) correlated after conditioning
Solution | How Bias Solved | Which Bias Removed | Assumes | Internal Validity | External Validity |
---|---|---|---|---|---|
Experiment | Randomization breaks \(W \rightarrow X\) link | All confounding variables | \(X\) is random; change only \(X\) | High | Low |
Conditioning | Hold confounders constant | Only variables conditioned on | Condition on all confounders; low measurement error; have similar cases | Low | High |
Conditioning:
If we know confounding variables, can we find cases with and without rallies that are the same on many confounding variables?
If we don’t know or can’t measure confounding variables, there may still be differences between places with and without rallies that produce confounding.
We can make Before and After comparisons:
Taking the same data from Feinberg, Branton, and Martinez-Ebers…
DISCUSS
If we compare counties to themselves before and after rallies…
Which confounding variables are held constant?
What are confounding variables that might NOT be addressed in this comparison?
Conditioning removes confounding by:
Design-based solutions remove confounding by:
Which of these possible confounders are held constant in a before-and-after comparison?
All confounding variables (those that affect whether a rally occurs and affect hate crimes) that are unchanging over time (from before to after) are held constant.
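A minimal simulation of this logic, with every number an assumption chosen for illustration: each county has an unchanging confounder `u` that makes rallies more likely and raises hate crimes in both periods, and a rally adds `TRUE_EFFECT` to the outcome afterwards. Nothing else changes over time in this sketch.

```python
import random

def mean(values):
    values = list(values)
    return sum(values) / len(values)

rng = random.Random(3)
TRUE_EFFECT = 0.5  # assumed effect of a rally, chosen for illustration

counties = []
for _ in range(5000):
    u = rng.gauss(0, 1)                      # time-invariant confounder
    rally = 1 if rng.random() < (0.7 if u > 0 else 0.2) else 0
    before = 2.0 * u + rng.gauss(0, 1)       # outcome before any rally
    after = 2.0 * u + TRUE_EFFECT * rally + rng.gauss(0, 1)
    counties.append((rally, before, after))

# Comparing rally and non-rally counties afterwards mixes in differences in u.
cross_section = (mean(a for r, _, a in counties if r == 1)
                 - mean(a for r, _, a in counties if r == 0))
# Comparing rally counties to themselves holds u constant.
before_after = mean(a - b for r, b, a in counties if r == 1)

print("rally vs. no-rally counties (after):", round(cross_section, 2))
print("rally counties, after minus before:", round(before_after, 2))
```

Here the before-and-after comparison lands near the assumed effect while the cross-county comparison overstates it. The sketch deliberately leaves out anything that changes between the two periods; a confounder that varies over time would not be held constant by this comparison.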