March 29, 2019

Correlation to Causation

Plan for Today:

(1) Solutions for Bias

  • design-based
    • difference in difference
    • natural experiments

Design-Based Solutions

Design

Types of designs:

Designs using conditioning

  • Compare same case over time
  • Compare cases known to be similar at same time
  • "Differences in Differences"

Designs using random exposure to \(X\)

  • experiments
  • "natural experiments"

Design: Difference in Difference

What is it?

  • Compare changes in "treated" cases to changes in "untreated" cases before and after the "treatment"

How does it work?

  • Hold constant unchanging attributes of cases (compare same case before and after "treatment")
  • Hold constant variables that change together over time in "treated" and "untreated" cases

Assumption:

  • The "untreated" case shows the trend the "treated" case would have followed without the "treatment" (parallel trends)
  • No variables that affect \(Y\) change over time differently in "treated" and "untreated" cases
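
A minimal sketch of why the design works, assuming a simple additive model that the slides do not state. The symbols \(\alpha_i\) (unchanging attributes of case \(i\)), \(\gamma_t\) (trend shared by all cases at time \(t\)), \(\tau\) (effect of the "treatment"), \(D_{it}\) (equal to 1 only for the treated case after treatment), and \(\varepsilon_{it}\) (everything else) are introduced here for illustration:

\[
Y_{it} = \alpha_i + \gamma_t + \tau D_{it} + \varepsilon_{it}
\]

\[
(Y_{T,After} - Y_{T,Before}) - (Y_{C,After} - Y_{C,Before}) = \tau + \left[ (\varepsilon_{T,After} - \varepsilon_{T,Before}) - (\varepsilon_{C,After} - \varepsilon_{C,Before}) \right]
\]

Differencing within each case removes \(\alpha_i\), and differencing across cases removes \(\gamma_{After} - \gamma_{Before}\). Under this sketch, the bracketed term is zero on average only if nothing else that affects \(Y\) changes differently in the two cases, which is the assumption listed above.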

Difference in Difference: Example

Mueller and Schwarz 2018

Does social-media hate speech lead to real-world violence?

  • Anti-refugee content on Facebook and anti-refugee violence in Germany
  • Does anti-refugee hate speech on the AfD (far-right party) Facebook page cause increases in anti-refugee violence?

Difference in Difference: Example

Mueller and Schwarz 2018

Data

  • anti-refugee violence in Germany by week (and by town)
  • anti-refugee comments on AfD Facebook page (by week)
  • AfD Facebook followers per capita (by town)
  • Nutella Facebook followers per capita (by town)

Difference in Difference: Example

Mueller and Schwarz 2018

Research options:

  • "Same case over time": compare Germany-wide anti-refugee violence following Facebook hate speech
    • Could be that other events (terror attacks, political disputes over refugees) cause both hate speech AND violence (confounding variables over time)
  • "Similar cases at same time": compare anti-refugee violence over same period in communities with many AfD Facebook followers versus few AfD followers
    • Could be that people who follow AfD on Facebook are more likely to commit violence anyway ("similar" cases still fundamentally different)

Difference in Difference: Example

Difference-in-Difference:

Consider 2 types of towns, \(Town_T\) and \(Town_C\), at two times: \(Before\) and \(After\) an AfD anti-refugee comment

  • \(Town_T\) has many AfD Facebook followers
  • \(Town_C\) has few AfD Facebook followers

We measure \(Hate \ Crimes\) (\(Y\)) in both towns

Difference in Difference: Example

Difference-in-Difference:

  • \(FirstDiff = Hate \ Crimes_{After} - Hate \ Crimes_{Before}\) gives us change in hate crimes in a \(Town\)…
    • holding unchanging attributes of town (presence of refugees, Neo-Nazis, etc.) constant (same case over time)
  • \(SecondDiff = FirstDiff_{T} - FirstDiff_{C}\) gives us change in hate crimes in \(Treated\) towns over time, compared to \(Control\)
    • holding shared trends of both towns (exposure to political events/news) constant (similar cases at same points in time)
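
The two differences can be sketched with a small worked example. The hate-crime counts below are hypothetical numbers chosen for illustration, not figures from Mueller and Schwarz, and the function names are made up for this sketch:

```python
# Hypothetical hate-crime counts (illustrative only, not from the study).
# Keys are (town type, period): "T" = many AfD followers, "C" = few.
hate_crimes = {
    ("T", "before"): 4.0,
    ("T", "after"): 9.0,
    ("C", "before"): 2.0,
    ("C", "after"): 4.0,
}


def first_diff(town: str) -> float:
    """Change within one town over time: After minus Before."""
    return hate_crimes[(town, "after")] - hate_crimes[(town, "before")]


def second_diff() -> float:
    """Change in the treated town minus change in the control town."""
    return first_diff("T") - first_diff("C")


print(first_diff("T"))  # 5.0
print(first_diff("C"))  # 2.0
print(second_diff())    # 3.0
```

In this toy case the treated town's count rises by 5, but 2 of that rise also appears in the control town, so the difference-in-difference attributes only 3 to the "treatment".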

Difference in Difference: Example

Mueller and Schwarz 2018

Using difference-in-difference, find that:

  1. In weeks with more hateful Facebook posts (compared to weeks with fewer)…
  2. Towns with more AfD or Nutella Facebook followers (compared to towns with fewer)…
  3. see more anti-refugee violence

Anti-refugee violence is rare, but "high" Facebook usage areas have 60% more anti-refugee violence when there are more anti-refugee posts

Difference in Difference: Example

Assumptions for causality:

  • No variables that affect anti-refugee violence change differently over time in places with more Facebook users than in places with fewer

Difference in Difference: Example

A problem:

What if events involving refugees (crime, influx of refugees, etc.) make white supremacists more angry but have no effect on non-white supremacists?

  • Something changing over time (events involving refugees cause ANGER) that
    • increases anti-refugee Facebook content
    • separately, increases anti-refugee violence
    • does so more in places that have more AfD Facebook followers.

Something that affects violence but changes differently over time in "treated" and "untreated" communities

Confounding!
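
A minimal sketch of how this kind of confounding biases difference-in-difference, using a noise-free toy model with two towns and two periods. The model, the function name `did_estimate`, and all numbers are assumptions made for illustration; `extra_shock_treated` stands in for anger that rises only in towns with many AfD followers:

```python
def did_estimate(true_effect: float,
                 shared_shock: float,
                 extra_shock_treated: float) -> float:
    """Difference-in-difference on a toy two-town, two-period model."""
    baseline = {"T": 5.0, "C": 2.0}  # unchanging attributes of each town
    y = {}
    for town in ("T", "C"):
        treated = 1.0 if town == "T" else 0.0
        y[(town, "before")] = baseline[town]
        y[(town, "after")] = (
            baseline[town]
            + shared_shock                   # trend common to both towns
            + treated * extra_shock_treated  # trend only treated towns see
            + treated * true_effect          # causal effect of the posts
        )
    first_diff_t = y[("T", "after")] - y[("T", "before")]
    first_diff_c = y[("C", "after")] - y[("C", "before")]
    return first_diff_t - first_diff_c


# Parallel trends hold: the estimate equals the true effect.
print(did_estimate(true_effect=3.0, shared_shock=2.0, extra_shock_treated=0.0))  # 3.0

# Anger rises only in treated towns: the estimate absorbs the extra shock.
print(did_estimate(true_effect=3.0, shared_shock=2.0, extra_shock_treated=1.5))  # 4.5

# No true effect at all, yet the estimate is positive.
print(did_estimate(true_effect=0.0, shared_shock=2.0, extra_shock_treated=1.5))  # 1.5
```

The shared shock always cancels, but the shock that hits only treated towns cannot be told apart from the effect of the posts.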

Design: Natural Experiments

What is it?:

  • Comparison of cases where value of \(X\) is determined at random (or approximately at random)

How does it work?

  • Because \(X\) is random, it is unrelated (on average) to any \(W\) that causes \(Y\), so confounding is ruled out
  • Cases with different values of \(X\) are, on average, the same on all other attributes

Assumption for causality

  • \(X\) is actually random
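
A minimal simulation sketch of the logic above, assuming a made-up data-generating process: \(W\) strongly affects \(Y\), but \(X\) is assigned by coin flip and so ignores \(W\). The function name `simulate` and every parameter value are hypothetical:

```python
import random
from statistics import mean


def simulate(n: int = 20000, true_effect: float = 2.0, seed: int = 0):
    """Return (gap in W between X groups, difference in mean Y between X groups)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = rng.gauss(0, 1)                  # attribute W that raises Y
        x = 1 if rng.random() < 0.5 else 0   # X set by coin flip, ignoring W
        y = 1.0 + 3.0 * w + true_effect * x + rng.gauss(0, 1)
        rows.append((w, x, y))

    w_x1 = [w for w, x, _ in rows if x == 1]
    w_x0 = [w for w, x, _ in rows if x == 0]
    y_x1 = [y for _, x, y in rows if x == 1]
    y_x0 = [y for _, x, y in rows if x == 0]

    balance_gap = mean(w_x1) - mean(w_x0)      # should be near 0
    effect_estimate = mean(y_x1) - mean(y_x0)  # should be near true_effect
    return balance_gap, effect_estimate


balance_gap, effect_estimate = simulate()
print(f"gap in W between groups: {balance_gap:.3f}")
print(f"estimated effect of X:   {effect_estimate:.3f}")
```

With a large number of cases, the two groups end up with nearly the same average \(W\), so a plain comparison of means comes close to the effect built into the simulation. If \(X\) depended on \(W\) instead, that comparison would mix the two.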

Design: Natural Experiments

  1. "Standard" Natural Experiments: \(X\) is decided at random
    • Example: Does having more money make you more conservative on taxation?
    • Solution: compare lottery winners and losers
  2. Arbitrary Cut-offs:
    • Does electing secular political parties reduce religious violence?
    • Solution: Compare places where secular parties barely won to where they barely lost
  3. Something random (\(Z\)) that affects \(X\) for some cases
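
The arbitrary cut-off comparison in item 2 can be sketched as a difference in means inside a narrow window around the cut-off. This is a deliberately simplified stand-in for a full regression discontinuity analysis; the data, the `bandwidth` parameter, and the function name `close_election_gap` are hypothetical:

```python
from statistics import mean


def close_election_gap(places, bandwidth):
    """Mean outcome where the party barely won minus where it barely lost.

    `places` is a list of (vote_margin, violence) pairs, where a positive
    margin means the secular party won. Only places within `bandwidth`
    of the cut-off (margin 0) are compared.
    """
    barely_won = [y for margin, y in places if 0 < margin <= bandwidth]
    barely_lost = [y for margin, y in places if -bandwidth <= margin < 0]
    if not barely_won or not barely_lost:
        raise ValueError("no places inside the bandwidth on one side of the cut-off")
    return mean(barely_won) - mean(barely_lost)


# Hypothetical (vote margin, violence) data, for illustration only.
places = [
    (-0.30, 9.0), (-0.02, 6.0), (-0.01, 5.0),
    (0.01, 3.0), (0.02, 4.0), (0.25, 1.0),
]

print(close_election_gap(places, bandwidth=0.05))  # -2.0
```

The idea is that places with landslide results may differ in many ways, while which side of the line a near-tie lands on is closer to random. The narrow window therefore drops the two landslide places from the comparison.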

Natural Experiments: Example

Is there anything random that affects exposure to Facebook hate speech?

Internet outages:

  • where and when they occur is likely random, unrelated to
    • community's baseline white supremacism
    • major events in Germany involving refugees
  • but internet outages reduce exposure to Facebook

Natural Experiments: Example

In places hit by internet outages (compared to places without outages):

  1. In weeks with more hateful Facebook posts (compared to weeks with fewer)…
  2. Towns with more AfD or Nutella Facebook followers (compared to towns with fewer)…
  3. see no difference in anti-refugee violence

When internet access is randomly lost, Facebook posts have no effect on anti-refugee violence

| Solution | How Bias Solved | Which Bias Removed | Assumes | Internal Validity | External Validity |
|---|---|---|---|---|---|
| Adjustment | Hold constant | All measured confounding variables | Condition on all confounders | Lowest | Highest |
| Similar Cases | Hold constant | Cases' shared confounding variables | No differences between cases | Middle | Middle |
| Same Case | Hold constant | Case's unchanging confounding variables | No confounding trends | Middle | Middle |
| Diff in Diff | Hold constant | Case's unchanging variables; cases' shared trends | Cases have parallel trends | Higher | Lower |
| Natural Experiment | Break \(W \rightarrow X\) link | All confounding variables | \(X\) as-if random | Highest | Lowest |