December 4, 2024

Correlation to Causation

Solutions to Confounding

  1. Recap
    • Before and After
  2. Differences in Differences
    • What is it?
    • How does it work?
    • Assumptions?
    • Example

Recap

Solutions to Confounding

Every way of using correlation as evidence for causality makes assumptions

  • The Fundamental Problem of Causal Inference (FPCI) cannot be solved without assumptions
  • With assumptions, we can say confounding/bias is not a problem

| Solution | How Bias Solved | Which Bias Removed | Assumes | Internal Validity | External Validity |
|---|---|---|---|---|---|
| Experiment | Randomization: breaks \(W \rightarrow X\) link | All confounding variables | 1. \(X\) is random; 2. Change only \(X\) | Highest | Lowest |
| Conditioning | Hold confounders constant | Only variables conditioned on | 1. Condition on all confounders; 2. Low measurement error; 3. Cases similar in \(W\) | Lowest | Highest |
| Before and After | Hold confounders constant | Variables unchanging over time | No causes of \(Y\) change w/ \(X\) | Lower | Higher |

Example: Gun Laws

Does easing restrictions on gun laws increase murders committed using guns?

  • Some states in the US require all handgun purchasers to acquire a permit-to-purchase (PTP) license.
  • Only persons with a permit may purchase firearms
  • In late 2007, Missouri eliminated its PTP requirement

Example: Gun Laws

Webster et al. (2014) investigate:

  • Did the removal of the PTP law increase firearms homicides in Missouri?
  • Conditioning?: Lots of unique features of Missouri; no “otherwise similar” state.

Did the repeal CAUSE a change in murders using guns?

Example: Gun Laws

Holds all unique, unchanging characteristics of Missouri constant…

Example: Gun Laws

But, we have to assume that there is nothing else about Missouri that

  1. changed around the same time as the PTP (gun control) repeal
  2. and affected Firearms Homicides

(or more technically, assume that \(\color{red}{\text{Murders}_{MO,After}[\text{No Repeal}]} = \color{black}{\text{Murders}_{MO,Before}[\text{No Repeal}]}\))

That is: no long-term trends, no changes in how murders are measured, and no changes in crime that themselves led to the PTP repeal (crime \(\to\) PTP repeal).

Does this plot make it easier/harder to believe PTP repeal caused more murders? (DISCUSS)

Example: Gun Laws

Could other things have been changing between 2007 and 2008 that confound the relationship between the PTP repeal and murders?

  • Maybe an upward trend in long term?
  • Maybe 2006-2007 was aberration, 2008 a return to trend?
  • Did anything else happen in 2008?

Example: Gun Laws

| State | Time | Murders [Repeal] | Murders [No Repeal] | Repeal? |
|---|---|---|---|---|
| Missouri | Before | \(\color{red}{\text{Murders}_{MO,Before}[\text{Repeal}]}\) | \(\color{black}{\text{Murders}_{MO,Before}[\text{No Repeal}]}\) | No |
| | | | \(\neq\) | |
| Missouri | After | \(\color{black}{\text{Murders}_{MO,After}[\text{Repeal}]}\) | \(\color{red}{\text{Murders}_{MO,After}[\text{No Repeal}]}\) | Yes |


It appears that \(\color{red}{\text{Murders}_{MO,After}[\text{No Repeal}]} \neq \color{black}{\text{Murders}_{MO,Before}[\text{No Repeal}]}\)

because other factors were changing murders over time, regardless of the repeal.

Example: Gun Laws

What can we do to remove confounding from other variables that change over time, like…

  • weather patterns (hot weather \(\to\) murders)
  • global financial crises/economic shocks
  • political events

Another way to put this question: what would the trend in gun murders in Missouri have been had there been no PTP repeal? What was the counterfactual trend?

Example: Gun Laws

We want to compare the actual trend in Missouri:

\(\begin{equation}\begin{split}\text{Trend}_{MO} ={} & \color{black}{\text{Murders}_{MO,After}[\text{Repeal}]} - \\ & \color{black}{\text{Murders}_{MO,Before}[\text{No Repeal}]}\end{split}\end{equation}\)

against the counterfactual trend in Missouri:

\(\begin{equation}\begin{split}\color{red}{\text{CF Trend}_{MO}} ={} & \color{red}{\text{Murders}_{MO,After}[\text{No Repeal}]} - \\ & \color{black}{\text{Murders}_{MO,Before}[\text{No Repeal}]}\end{split}\end{equation}\)

The effect of the repeal is the difference between the two:

\(\small{\begin{equation}\begin{split}\text{Effect of Repeal} = {} & \overbrace{\{\text{Murders}_{MO,After}[\text{Repeal}] - \text{Murders}_{MO,Before}[\text{No Repeal}]\}}^{\text{Missouri observed trend}} - \\ & \underbrace{\{\color{red}{\text{Murders}_{MO,After}[\text{No Repeal}]} - \color{black}{\text{Murders}_{MO,Before}[\text{No Repeal}]}\}}_{\color{red}{\text{Missouri counterfactual trend}}}\end{split}\end{equation}}\)

  • Before and After assumes the counterfactual trend is always 0 (or continuation of linear trend)
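
One way to see this, as a worked restatement in the notation above (the label "Effect of Repeal" is shorthand introduced here for \(\color{black}{\text{Murders}_{MO,After}[\text{Repeal}]} - \color{red}{\text{Murders}_{MO,After}[\text{No Repeal}]}\)):

\(\begin{equation}\begin{split}\text{Trend}_{MO} ={} & \text{Effect of Repeal} + \color{red}{\text{CF Trend}_{MO}}\end{split}\end{equation}\)

The before-and-after comparison reports \(\text{Trend}_{MO}\), so it equals the effect of the repeal only when \(\color{red}{\text{CF Trend}_{MO}} = 0\). Any change in murders that would have happened without the repeal is counted as if the repeal caused it.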

Many possible counterfactual trends…

Which counterfactual is right?

Example: Gun Laws

We can’t know the counterfactual trend in Missouri…

but we can observe the trends in other states that did not change their gun purchasing laws (no change in Gun Control, \(X\)).

  • We can plug in the \(\text{factual TREND}\) in an “untreated” case (no change in \(X\)) for the \(\color{red}{\text{counterfactual TREND}}\) in the “treated” case (where \(X\) did change).

Arkansas has a different history than Missouri, so there are unchanging differences between them.

But if Arkansas experiences the same regional economic, political, cultural, and weather trends as Missouri, the two states likely share the same trends over time.

Then, we can plug in

\(\small{\begin{equation}\begin{split}\text{Estimated Effect} = {} & \overbrace{\{\text{Murders}_{MO,After}[\text{Repeal}] - \text{Murders}_{MO,Before}[\text{No Repeal}]\}}^{\text{Missouri observed trend}} - \\ & \underbrace{\{\text{Murders}_{AR,After}[\text{No Repeal}] - \text{Murders}_{AR,Before}[\text{No Repeal}]\}}_{\text{Arkansas observed trend}}\end{split}\end{equation}}\)

Missouri and Arkansas differed in 2007, but if Missouri had (counterfactually) followed the same trend as Arkansas, what would we expect murders to have done in 2008 without the repeal?

Missouri’s counterfactual trend is parallel to / same as Arkansas’s factual trend

With your neighbors, discuss: Do you believe this is evidence of causality? What confounding does this address? What confounding does it not address?

Difference in Differences

Design Based Solution:

Like before and after, difference in differences comparisons are design based:

By comparing changes over time in “treated” (\(X\) changes) and “untreated” (\(X\) does not change) cases:

  • hold constant all unchanging confounding variables in both treated and untreated cases
  • hold constant all similarly changing confounding variables across treated and untreated cases

Regardless of whether we have thought of those variables, whether we can measure those variables.

Design: Difference in Differences

What is it?

  • Compare changes in “treated” cases before and after “treatment” to before and after changes in “untreated” cases

How does it work?

  • Hold constant unchanging attributes of cases (compare same case before and after “treatment”)
  • Hold constant variables that change together over time in both “treated” and “untreated” cases

Design: Difference in Differences

Why is it called difference in differences?

\(\small{\begin{equation}\begin{split}\text{Estimated Effect} = {} & \overbrace{\{\text{Murders}_{MO,After}[\text{Repeal}] - \text{Murders}_{MO,Before}[\text{No Repeal}]\}}^{\text{Missouri observed trend}} - \\ & \underbrace{\{\text{Murders}_{AR,After}[\text{No Repeal}] - \text{Murders}_{AR,Before}[\text{No Repeal}]\}}_{\text{Arkansas observed trend}}\end{split}\end{equation}}\)

Design: Difference in Differences

So:

  • \(\mathrm{Difference \ 1} = Murders_{After} - Murders_{Before}\) gives us trend in murders in a \(State\)…
    • holding unchanging attributes of state constant (difference over time)
  • \(\mathrm{Difference \ 2} = \mathrm{Difference \ 1}_{Missouri} - \mathrm{Difference \ 1}_{Arkansas}\) gives us change in murders in \(Treated\) over time, compared to trend in \(Control\)
    • holds changing attributes of both states constant (difference in trends)

Design: Difference in Differences

| | \(Murder_{Before}\) | \(Murder_{After}\) | First Difference |
|---|---|---|---|
| \(\mathrm{Missouri}\) | \(4.6\) | \(6.2\) | \(1.6\) |
| \(\mathrm{Arkansas}\) | \(5.6\) | \(5.4\) | \(-0.2\) |
| Second Difference | | | \(1.8\) |
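
The arithmetic in this table can be reproduced with a short script. This is a minimal sketch: the four values come from the table above, while the names `rates`, `first_difference`, and `second_difference` are illustrative choices, not part of the original study.

```python
# Values from the table above: Missouri repealed PTP, Arkansas did not.
rates = {
    "Missouri": {"before": 4.6, "after": 6.2},  # "treated" case
    "Arkansas": {"before": 5.6, "after": 5.4},  # "untreated" case
}


def first_difference(state):
    """Change over time within one state (after minus before)."""
    return rates[state]["after"] - rates[state]["before"]


def second_difference(treated, untreated):
    """Change in the treated state minus change in the untreated state."""
    return first_difference(treated) - first_difference(untreated)


did = second_difference("Missouri", "Arkansas")

# Rounded to one decimal to avoid floating-point noise.
print(round(first_difference("Missouri"), 1))  # 1.6
print(round(first_difference("Arkansas"), 1))  # -0.2
print(round(did, 1))                           # 1.8
```

A before-and-after comparison alone would report 1.6; subtracting the Arkansas change of -0.2 raises the estimate to 1.8.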

Design: Difference in Differences

Confounding Solved…

All confounding variables (those that affect whether PTP is repealed and affect firearms homicides) that are unchanging over time are held constant

  • By comparing change over time within the same case

All confounding variables that change similarly in “treated” and “untreated” cases are held constant.

  • By comparing change over time in “treated” to change over time in “control”
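
A small constructed example may help show why both kinds of confounding drop out. Every number below is hypothetical and chosen only for illustration (none comes from the gun-law data): each case's outcome is built from a fixed case-specific level, a shock shared by both cases, and a treatment effect that applies only to the treated case after treatment.

```python
# Hypothetical ingredients (illustration only, not real data).
CASE_LEVEL = {"treated": 4.0, "untreated": 6.0}  # unchanging differences between cases
COMMON_SHOCK = {"before": 0.0, "after": 0.5}     # change over time shared by both cases
TRUE_EFFECT = 1.5                                # effect we hope to recover


def outcome(case, period):
    """Outcome = fixed case level + shared shock (+ effect if treated, after)."""
    y = CASE_LEVEL[case] + COMMON_SHOCK[period]
    if case == "treated" and period == "after":
        y += TRUE_EFFECT
    return y


def change(case):
    """Before-to-after change within one case."""
    return outcome(case, "after") - outcome(case, "before")


# Comparing cases after treatment is biased by the unchanging level difference.
cross_section = outcome("treated", "after") - outcome("untreated", "after")

# Before and after removes the level difference but keeps the shared shock.
before_after = change("treated")

# Difference in differences removes both.
diff_in_diff = change("treated") - change("untreated")

print(cross_section)  # -0.5
print(before_after)   # 2.0
print(diff_in_diff)   # 1.5
```

Only the last comparison returns the built-in effect of 1.5. If the shock differed between the two cases, the difference in differences would be off by exactly that gap.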

Design: Difference in Differences

In order to infer \(X\) causes \(Y\) if \(X,Y\) correlated in difference-in-differences comparison…

Assumption

  • we assume the observed trend in \(Y\) for “untreated” case is equal to the “counterfactual trend” in \(Y\) for the “treated” case.
  • Equivalently: we assume “treated” and “untreated” cases have “parallel trends” in \(Y\).
  • Equivalently: there are no variables that affect \(Y\) and change over time differently in “treated” and “untreated” cases
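
For the Missouri example, a short derivation shows what this assumption buys us. It is a sketch in the slides' notation, where "Effect of Repeal" is shorthand introduced here for \(\color{black}{\text{Murders}_{MO,After}[\text{Repeal}]} - \color{red}{\text{Murders}_{MO,After}[\text{No Repeal}]}\), \(\color{red}{\text{CF Trend}_{MO}}\) is Missouri's counterfactual trend, and \(\text{Trend}_{AR}\) is Arkansas's observed trend. Adding and subtracting \(\color{red}{\text{Murders}_{MO,After}[\text{No Repeal}]}\) inside the difference in differences gives:

\(\small{\begin{equation}\begin{split}\text{Diff-in-Diff} ={} & \text{Effect of Repeal} + \\ & \{\color{red}{\text{CF Trend}_{MO}} - \color{black}{\text{Trend}_{AR}}\}\end{split}\end{equation}}\)

Under parallel trends the term in braces is zero, so the difference in differences equals the effect of the repeal. If the trends are not parallel, that term is the bias that remains.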

Do you believe assumption of “parallel trends”? (Counterfactual Missouri trend same as factual Arkansas trend)

Design: Difference in Differences

Confounding UNSolved…

  • Arkansas and Missouri murder rates mostly move together before 2007.
  • But large change in AR in 2001/2002 not in MO
  • But in 2007, before law took effect, murders dipped


Perhaps there are some things that affect murder rates that change differently in these two states.

Design: Difference in Differences

When is the assumption plausible?

  • We can check whether cases share trends before treatment, but this does not prove they would have shared trends after treatment
  • We should compare cases that experience many similar changes over time: comparing Missouri to British Columbia may not be helpful.
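
The pre-treatment check in the first bullet can be sketched as a comparison of year-to-year changes. The series below are hypothetical placeholders, not the Missouri or Arkansas data, and the function names are illustrative.

```python
def yearly_changes(series):
    """Year-to-year changes, given outcome values ordered by year."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def pre_trend_gaps(treated_pre, untreated_pre):
    """Treated change minus untreated change for each pre-treatment year."""
    if len(treated_pre) != len(untreated_pre):
        raise ValueError("both series must cover the same years")
    return [
        t - u
        for t, u in zip(yearly_changes(treated_pre), yearly_changes(untreated_pre))
    ]


# Hypothetical pre-treatment outcomes for four consecutive years.
treated_pre = [5.0, 5.5, 5.0, 4.5]
untreated_pre = [6.0, 6.5, 6.0, 6.5]

gaps = pre_trend_gaps(treated_pre, untreated_pre)
largest_gap = max(abs(g) for g in gaps)

print(gaps)         # [0.0, 0.0, -1.0]
print(largest_gap)  # 1.0
```

Gaps near zero mean the cases moved together before treatment. A large gap, like the last one here, is a warning sign. Small gaps are still only suggestive, since they say nothing certain about trends after treatment.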

| Solution | How Bias Solved | Which Bias Removed | Assumes | Internal Validity | External Validity |
|---|---|---|---|---|---|
| Experiment | Randomization: breaks \(W \rightarrow X\) link | All confounding variables | 1. \(X\) is random; 2. Change only \(X\) | Highest | Lowest |
| Conditioning | Hold confounders constant | Only variables conditioned on | see above | Lowest | Highest |
| Before and After | Hold confounders constant | Variables unchanging over time | No causes of \(Y\) change w/ \(X\) | Lower | Higher |
| Diff in Diff | Hold confounders constant | Variables unchanging and similarly changing | Parallel trends | Higher | Lower |

Application

Automation and Wages

Data:

  • Exposure to automation (\(X\)): industry-level investment in robotics and software and adoption of these technologies outside the US
  • Wages (\(Y\)): Hourly Real Wages for workers.

“Cases”:

  • Demographic groups defined by race, age, gender, education levels
  • For each group, calculate “exposure to automation” based on the industries they work in
  • For each group, calculate hourly real wages

Automation and Wages

What might be some confounding variables if we just compared wages for workers with exposure/no exposure to automation?

What might be some confounding variables if we just compared wages in the US before and after rise of automation?

Automation and Wages

Rather than looking at wages in industries with more automation, or change in wages in the US over time, use a difference in differences:

They compare:

  • Change in real wages for demographic groups with high exposure to automation between 1980 and 2016 (change in \(Y\) for group where \(X\) changes)

  • Change in real wages for demographic groups with low/no exposure to automation between 1980 and 2016 (change in \(Y\) for group where \(X\) does not change)

Assume that counterfactual trend in wages for workers exposed to automation SAME as factual trend in wages for workers not exposed to automation.

Result: groups with a greater increase in automation exposure saw a greater decline in wages.

Automation and Wages

Correlation suggests automation \(\xrightarrow{causes}\) declining wages

  • Can’t be confounding due to unchanging differences between demographic groups or industries
  • Can’t be confounding due to factors similarly affecting all groups (e.g. national/global changes)

For this to be the causal effect of automation, we need to believe that wage trends for workers exposed and not exposed to automation would have been similar without automation…

No differences in wage trends before automation.

Automation and Wages

It still could be that other things that affect wages changed differently for workers exposed to automation than for those who were not.

  • Read the paper to see how authors rule out alternatives (e.g. moving manufacturing jobs elsewhere due to trade competition)

“capital takes what it will in the absence of constraints and technology is a tool that can be used for good or for ill… Yes, [during the Industrial Revolution of the 19th Century] you got progress, but you also had costs that were huge and very long-lasting. A hundred years of much harsher conditions for working people, lower real wages, much worse health and living conditions, less autonomy, greater hierarchy. And the reason that we came out of it wasn’t some law of economics, but rather a grass roots social struggle in which unions, more progressive politics and, ultimately, better institutions played a key role — and a redirection of technological change away from pure automation also contributed importantly.”

  • Daron Acemoglu

So… Luddites?

Conclusion