Let’s assume we have done our due diligence and…
We find an effect of this treatment on our outcome of interest.
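To further probe our interpretation of this result against alternative interpretations, we typically look at one or both of the following: moderators (where, or for whom, is the effect larger or smaller?) and mediators (through which path does the effect operate?).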
Did the local success of an explicitly secular political party, Indian National Congress (INC), reduce the likelihood of religious riots between Hindus and Muslims in India?
Nellis et al. compare the incidence of riots in districts in which the INC barely won and barely lost MLA elections (within 1 percentage point).
They estimate the model:
\[\mathrm{Any \ Riot}_{it} = \alpha + \beta_1 \mathrm{INC \ Victory}_{it} + \epsilon_{it}\]
Here we compute \(\mathrm{Any \ Riot}_{it}\) for each district between legislators taking office and the next election.
| | Any Riot |
|---|---|
| Intercept | 0.236*** |
| | (0.024) |
| INC Win | −0.076* |
| | (0.031) |
| Num.Obs. | 644 |
| R2 | 0.009 |
| Std.Errors | HC0 |

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001
Why does this effect occur? Why does the INC (a secular party) winning office reduce violence?
The INC portrayed itself as a secular party capable of protecting Muslims (a religious minority). Uniquely among parties, it had a strong connection to Muslim voters.
Why does this effect occur?
Wilkinson (2004) argues that all political parties face incentives to compete over minority votes when those groups are electorally pivotal. So, all parties have incentives to stop religious riots in India when Muslims are a larger share of voters and when party competition (the effective number of parties, ENP) is higher.
This implies we should see a smaller effect of an INC win when there are more Muslims and when there is a higher ENP.
In this example, Muslim population and Effective Number of Parties are moderators.
Acharya, Blackwell, and Sen (2016) examine the long-term effects of slavery in the United States.
They examine whether it affected contemporary racial attitudes among white Americans.
Survey data on white Americans in southern states from 1984 to 2011, aggregated to the county level.
Regression of average county attitudes on enslaved % in 1860.
How does slavery affect contemporary political attitudes?
what is the path?
Authors argue it is slavery \(\to\) preservation of political/economic power \(\to\) racist norms/institutions \(\to\) contemporary attitudes
But it could also be that there are OTHER mechanisms: slavery \(\to\) greater contemporary African American population \(\to\) contemporary perceived “racial threat” \(\to\) contemporary attitudes
In this case, 19th century racist norms and 20th century black population are on the path from slave population to racist attitudes. Expressed in terms of potential outcomes.
(Board)
We examine stories about moderators using either: split sample or interaction effects
We examine mediation using tools that estimate specific causal estimands
Interactions in regression permit slopes/intercept for one variable to differ across values of another variable.
We have a simple model:
\(Y_i = \beta_0 + \beta_1 D_i + \beta_2 X_i\)
If we want to permit an interaction between \(D\) and \(X\), we add a coefficient for a variable that is the product of \(D\) and \(X\):
\(Y_i = \beta_0 + \overbrace{\beta_1 D_i + \beta_2 X_i}^{\text{Main Terms}} + \underbrace{\beta_3D_i\cdot X_i}_{\text{Interaction}}\)
Always include main terms alongside the interaction term (otherwise assumes “effect of \(D\) is 0 when \(X\) is 0”)
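To see why dropping the main terms is restrictive, here is a minimal sketch on hypothetical, noise-free data (the variable names and coefficient values are made up for illustration). It compares the effect of \(D\) at \(X = 0\) implied by the full model and by an interaction-only model.

```r
# Hypothetical noise-free data; names and coefficients are illustrative only
d <- expand.grid(D = c(0, 1), X = 0:4)
d$Y <- 1 + 2 * d$D + 0.5 * d$X + 1.5 * d$D * d$X

full       <- lm(Y ~ D + X + D:X, data = d)  # main terms + interaction
restricted <- lm(Y ~ D:X, data = d)          # interaction term only

# Effect of D at X = 0 implied by each model: prediction at D = 1 minus D = 0
at_zero <- data.frame(D = c(0, 1), X = 0)
effect_full       <- diff(predict(full, newdata = at_zero))
effect_restricted <- diff(predict(restricted, newdata = at_zero))

effect_full        # recovers the true effect at X = 0 (2 in this setup)
effect_restricted  # 0 by construction, whatever the data say
```

The restricted model cannot produce anything other than a zero effect at \(X = 0\), because the only term involving \(D\) vanishes there.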
Did the local success of an explicitly secular political party, Indian National Congress (INC), reduce the likelihood of religious riots between Hindus and Muslims in India?
Does the effect of INC victory differ between places with high/low ENP?
\[\displaylines{\mathrm{Any \ Riot}_{it} = \beta_0 + \beta_1 \mathrm{INC \ Win}_{it} + \beta_2 \text{High ENP}_{it} +\\\beta_3 \text{INC Win}_{it}\cdot \text{High ENP}_{it}}\] \[\displaylines{\mathrm{Any \ Riot}_{it} = \beta_0 + \beta_1 \mathrm{INC \ Win}_{it} + \beta_2 \text{MuslimHigh}_{it} + \\ \beta_3 \text{INC Win}_{it}\cdot \text{MuslimHigh}_{it}}\]
Where \(MuslimHigh\) and \(HighENP\) are \(1\) if above median, \(0\) if below.
This is a Binary by Binary interaction:
Let’s write out the design matrix…
\(\displaylines{\mathrm{Any \ Riot}_{it} = \beta_0 + \beta_1 \mathrm{INC \ Win}_{it} + \beta_2 \text{High ENP}_{it} +\\\beta_3 \text{INC Win}_{it}\cdot \text{High ENP}_{it}}\)
Different intercepts (means) for each group:
| | INC Lose | INC Win |
|---|---|---|
| ENP Low | ? | ? |
| ENP High | ? | ? |
What combinations of coefficients give us the mean probability of a riot for each cell in this table?
Different intercepts (means) for each group:
| | INC Lose | INC Win |
|---|---|---|
| ENP Low | \(\beta_0\) | \(\beta_0 + \beta_1\) |
| ENP High | \(\beta_0 + \beta_2\) | \(\beta_0 + \beta_1 + \beta_2 + \beta_3\) |
Let’s interpret what each coefficient means
\(\displaylines{\mathrm{Any \ Riot}_{it} = \beta_0 + \beta_1 \mathrm{INC \ Win}_{it} + \beta_2 \text{High ENP}_{it} +\\\beta_3 \text{INC Win}_{it}\cdot \text{High ENP}_{it}}\)
\(\beta_0\): Pr(Riot) when INC loses in a low ENP district
\(\beta_1\): Difference in Pr(Riot) when INC wins (vs. loses) in a low ENP district
\(\beta_2\): Difference in Pr(Riot) in a high (vs. low) ENP district when INC loses
\(\beta_3\): Difference between high and low ENP districts in the change in Pr(Riot) from an INC win (vs. loss)
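The mapping from coefficients to cell means can be checked numerically. The sketch below uses simulated data (the probabilities are hypothetical, not estimates from Nellis et al.) and confirms that the coefficient combinations reproduce the raw mean of the outcome in each of the four cells.

```r
set.seed(1)
n <- 400
sim <- data.frame(
  INC_win  = rbinom(n, 1, 0.5),
  ENP_high = rbinom(n, 1, 0.5)
)
# Hypothetical riot probabilities, chosen only for illustration
p <- 0.25 - 0.08 * sim$INC_win - 0.01 * sim$ENP_high +
  0.01 * sim$INC_win * sim$ENP_high
sim$any_riot <- rbinom(n, 1, p)

fit <- lm(any_riot ~ INC_win * ENP_high, data = sim)
b <- unname(coef(fit))  # order: intercept, INC_win, ENP_high, interaction

# Cell means implied by the coefficients (rows: ENP low/high; columns: INC lose/win)
from_coefs <- matrix(
  c(b[1], b[1] + b[3], b[1] + b[2], b[1] + b[2] + b[3] + b[4]),
  nrow = 2,
  dimnames = list(ENP_high = c("0", "1"), INC_win = c("0", "1"))
)

# Raw cell means computed directly from the data
raw_means <- with(sim, tapply(any_riot, list(ENP_high = ENP_high, INC_win = INC_win), mean))

all.equal(from_coefs, raw_means, check.attributes = FALSE)
```

Because the binary by binary model is saturated, the regression is just a reparameterization of the four cell means.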
Easy to do in R:
m2 = lm(any_riot ~ INC_win*ENP_high,
data = riots)
m3 = lm(any_riot ~ INC_win + muslim_high + INC_win:muslim_high,
data = riots)
`*` expands out the ‘main’ effects and interaction effects
`:` just multiplies two variables together, no ‘main’ effects
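One way to see the difference between the two operators is to inspect the columns of the design matrix that each formula generates. The toy data frame below is hypothetical and exists only for this check.

```r
# Hypothetical toy data, used only to inspect the design matrix
toy <- data.frame(D = c(0, 1, 0, 1), X = c(0, 0, 1, 1))

cols_star  <- colnames(model.matrix(~ D * X, data = toy))
cols_colon <- colnames(model.matrix(~ D:X, data = toy))
cols_long  <- colnames(model.matrix(~ D + X + D:X, data = toy))

cols_star   # "(Intercept)" "D" "X" "D:X"
cols_colon  # "(Intercept)" "D:X"
identical(cols_star, cols_long)  # TRUE: D * X is shorthand for D + X + D:X
```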
library(modelsummary)
modelsummary(list("Any Riot" = m2),
             output = "html",
             vcov = "HC1",   # heteroskedasticity-robust standard errors
             stars = TRUE,
             gof_omit = "DF|Deviance|AIC|BIC|Adj|Lik|F|RMSE",
             coef_rename = c("(Intercept)" = "Intercept", "INC_win" = "INC Win",
                             "ENP_highTRUE" = "High ENP", "INC_win:ENP_highTRUE" = "INC Win x High ENP",
                             "muslim_high" = "High Muslim", "INC_win:muslim_high" = "INC Win x High Muslim"))
| | Any Riot |
|---|---|
| Intercept | 0.242*** |
| | (0.035) |
| INC Win | −0.084+ |
| | (0.045) |
| High ENP | −0.009 |
| | (0.048) |
| INC Win x High ENP | 0.012 |
| | (0.063) |
| Num.Obs. | 636 |
| R2 | 0.010 |
| Std.Errors | HC1 |

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001
\(\displaylines{\mathrm{Any \ Riot}_{it} = \beta_0 + \beta_1 \mathrm{INC \ Win}_{it} + \beta_2 \text{High ENP}_{it} +\\\beta_3 \text{INC Win}_{it}\cdot \text{High ENP}_{it}}\)
If we use hypothesis test of \(\beta_3\), can we tell whether the effect of INC Win in High ENP districts is different from \(0\)?
If we use hypothesis test of \(\beta_1\), can we tell whether the effect of INC Win in Low ENP districts is different from \(0\)?
If the moderator is binary, the interaction term lets us test whether there is a significant difference in treatment effects between the two sub-groups.
If you want to test whether the effect in each group is different from \(0\), you have to look at the “main effect” of treatment, which is the effect in the group where the moderator is \(0\). To get the effect in the other group, reverse code the moderator (e.g., interact with “Low ENP”).
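A minimal sketch of the reverse-coding trick, on simulated data with hypothetical probabilities. It also shows the equivalent hand calculation: the effect in high ENP districts is \(\beta_1 + \beta_3\), and its standard error comes from the coefficient covariance matrix. Classical standard errors are used here for simplicity; a robust covariance matrix could be substituted.

```r
set.seed(2)
n <- 500
sim <- data.frame(INC_win = rbinom(n, 1, 0.5), ENP_high = rbinom(n, 1, 0.5))
sim$any_riot <- rbinom(n, 1, 0.25 - 0.08 * sim$INC_win)  # hypothetical probabilities
sim$ENP_low <- 1 - sim$ENP_high                           # reverse-coded moderator

fit_high <- lm(any_riot ~ INC_win * ENP_high, data = sim)
fit_low  <- lm(any_riot ~ INC_win * ENP_low,  data = sim)

# Effect of an INC win in high ENP districts, by hand: beta_1 + beta_3
b <- coef(fit_high)
V <- vcov(fit_high)  # classical covariance matrix (an assumption of this sketch)
est <- b["INC_win"] + b["INC_win:ENP_high"]
se  <- sqrt(V["INC_win", "INC_win"] +
            V["INC_win:ENP_high", "INC_win:ENP_high"] +
            2 * V["INC_win", "INC_win:ENP_high"])

# The same quantity read directly off the reverse-coded model
reversed <- summary(fit_low)$coefficients["INC_win", c("Estimate", "Std. Error")]
reversed
c(est = unname(est), se = se)
```

Both routes give the same estimate and standard error, since the two models are reparameterizations of each other.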
Different intercepts (means) for each group:
| | Untreated | Treated |
|---|---|---|
| Pre | Intercept | Intercept + Treated |
| Post | Intercept + Post | Intercept + Treated + Post + Treated:Post |
lm(Y ~ Treated*Post, data = data)
If we have many treated, many untreated units \(i\), across multiple time periods \(t\), it is common to do this:
lm(Y ~ Treated*Post + dummy_i + dummy_t, data = data)
fixest::feols(Y ~ Treated*Post | dummy_i + dummy_t, data = data)
If we include a dummy for each unit \(i\) and a dummy for each time period \(t\)… Which of these four coefficients will we be able to actually estimate? (think about linear independence)
Intercept + Treated_i + Post_t + Treated_i:Post_t
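The linear independence question can be explored on a small simulated panel. Everything below is hypothetical (six units, four periods, made-up effect sizes). The dummies are entered first in the formula so that `lm()` reports `NA` for the columns that are linear combinations of the unit and period dummies.

```r
set.seed(3)
# Hypothetical balanced panel: 6 units observed in 4 periods
panel <- expand.grid(unit = 1:6, time = 1:4)
panel$Treated <- as.numeric(panel$unit > 3)  # constant within unit
panel$Post    <- as.numeric(panel$time > 2)  # constant within period
panel$Y <- 0.3 * panel$unit + 0.2 * panel$time +
  1.0 * panel$Treated * panel$Post + rnorm(nrow(panel), sd = 0.1)

# Unit and period dummies first, then the interaction model
fit <- lm(Y ~ factor(unit) + factor(time) + Treated * Post, data = panel)
coef(fit)[c("Treated", "Post", "Treated:Post")]  # Treated and Post are NA

# Difference-in-differences of the four group means, for comparison
did <- with(panel,
  (mean(Y[Treated == 1 & Post == 1]) - mean(Y[Treated == 1 & Post == 0])) -
  (mean(Y[Treated == 0 & Post == 1]) - mean(Y[Treated == 0 & Post == 0])))
did
```

`Treated` is a sum of unit dummies and `Post` is a sum of period dummies, so only the interaction is separately estimable. In this balanced panel it matches the difference-in-differences of group means.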
When \(D\) and moderator \(M\) are both binary, interaction effects are very straightforward.
Rather than fitting different intercepts for groups:
Assumptions
| | (1) |
|---|---|
| Intercept | 0.229* |
| | (0.106) |
| INC Win | −0.076 |
| | (0.135) |
| ENP | 0.003 |
| | (0.035) |
| INC Win x ENP | −0.001 |
| | (0.044) |
| Num.Obs. | 636 |
| R2 | 0.010 |
| Std.Errors | HC1 |

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001
Interpretation of coefficients?
How can we make this more interpretable?
| | Original | Mean Centered |
|---|---|---|
| Intercept | 0.229* | 0.237*** |
| | (0.106) | (0.024) |
| INC Win | −0.076 | −0.078* |
| | (0.135) | (0.032) |
| ENP | 0.003 | 0.003 |
| | (0.035) | (0.035) |
| INC Win x ENP | −0.001 | −0.001 |
| | (0.044) | (0.044) |
| Num.Obs. | 636 | 636 |
| R2 | 0.010 | 0.010 |
| Std.Errors | HC1 | HC1 |

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001
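Why centering changes the INC Win coefficient but not the interaction can be verified on simulated data. The sketch below uses a hypothetical ENP range and hypothetical riot probabilities. After centering, the coefficient on treatment is the effect at the mean of the moderator rather than at ENP = 0.

```r
set.seed(4)
n <- 300
sim <- data.frame(INC_win = rbinom(n, 1, 0.5),
                  ENP = runif(n, 1.5, 5))            # hypothetical ENP range
sim$any_riot <- rbinom(n, 1, 0.25 - 0.08 * sim$INC_win)
sim$ENP_c <- sim$ENP - mean(sim$ENP)                 # mean-centered moderator

raw      <- lm(any_riot ~ INC_win * ENP,   data = sim)
centered <- lm(any_riot ~ INC_win * ENP_c, data = sim)

# The interaction coefficient is unchanged by centering
int_raw      <- coef(raw)[["INC_win:ENP"]]
int_centered <- coef(centered)[["INC_win:ENP_c"]]

# The centered coefficient on INC_win equals the raw-model effect at mean ENP
effect_at_mean <- coef(raw)[["INC_win"]] + int_raw * mean(sim$ENP)
main_centered  <- coef(centered)[["INC_win"]]

c(int_raw = int_raw, int_centered = int_centered)
c(effect_at_mean = effect_at_mean, main_centered = main_centered)
```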
In this context, we are more interested in marginal effects:
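For the linear interaction model \(Y_i = \beta_0 + \beta_1 D_i + \beta_2 X_i + \beta_3 D_i \cdot X_i\) with binary \(D\), the marginal effect of treatment at moderator value \(x\), and the variance of its estimate, follow from the coefficients. This is the standard result for a linear combination of coefficients, written in the generic notation used earlier:

\[E[Y_i \mid D_i = 1, X_i = x] - E[Y_i \mid D_i = 0, X_i = x] = \beta_1 + \beta_3 x\]
\[\widehat{Var}(\hat\beta_1 + \hat\beta_3 x) = \widehat{Var}(\hat\beta_1) + x^2 \, \widehat{Var}(\hat\beta_3) + 2x \, \widehat{Cov}(\hat\beta_1, \hat\beta_3)\]

Both the effect and its precision depend on the value of the moderator at which the effect is evaluated, so a single coefficient and standard error from the table are not enough.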
We also want to be careful about assuming that the effect of INC Win changes linearly with ENP.
Best to use the interflex package in R, which is based on Hainmueller et al. 2018.
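To give a feel for what a binning approach does, here is a simplified base R sketch. It is not the interflex implementation: `binned_effects` is a hypothetical helper that runs a separate regression within each quantile bin of the moderator, with the moderator centered at the bin median, so the treatment coefficient is the effect at that median. The data are simulated with a deliberately nonlinear moderation pattern.

```r
# Simplified binning estimator (hypothetical helper, not the interflex API):
# within each bin of x, regress y on d interacted with (x - bin median).
# The coefficient on d is then the estimated effect of d at the bin median.
binned_effects <- function(y, d, x, nbins = 3) {
  breaks <- quantile(x, probs = seq(0, 1, length.out = nbins + 1))
  bin <- cut(x, breaks = breaks, include.lowest = TRUE, labels = FALSE)
  rows <- lapply(seq_len(nbins), function(j) {
    in_bin <- bin == j
    x_med <- median(x[in_bin])
    fit <- lm(y[in_bin] ~ d[in_bin] * I(x[in_bin] - x_med))
    est <- summary(fit)$coefficients[2, c("Estimate", "Std. Error")]
    data.frame(bin = j, x_median = x_med,
               effect = est[["Estimate"]], se = est[["Std. Error"]])
  })
  do.call(rbind, rows)
}

set.seed(5)
n <- 600
x <- runif(n, 0, 1)
d <- rbinom(n, 1, 0.5)
y <- 0.2 - 0.3 * d * x^2 + rnorm(n, sd = 0.2)  # hypothetical nonlinear moderation
binned_effects(y, d, x)
```

If the bin-level effects do not fall roughly on a straight line in the moderator, the linear interaction model is a poor summary. The package itself should be preferred for real analyses, since it handles inference and plotting.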
Check common support
[Figure: “Binning” Estimator plot]
[Figure: “Kernel” Estimator plot]
Assumptions
Download Riots data:
riots = read.csv("https://www.dropbox.com/scl/fi/f5h9iz8qtvqdwv1frgqnv/l13_riots.csv?rlkey=155uky6xqvg7ro4vcopwddoic&dl=1")
Use lm to regress any_riot on the binary x binary interaction of INC_win by muslim_high.
Present the results using modelsummary.
Interpret the coefficients on INC_win and the interaction.
Use the interflex “binning” estimator to look at the binary x continuous interaction of INC_win by muslim_prop.
(Is the effect of INC_win linear in muslim_prop? When is the effect significantly different from \(0\)?)
suffrage = read.csv("https://www.dropbox.com/scl/fi/jr6hza8yodevb0ptrhty2/l13_suff_interaction.csv?rlkey=sdtsiak5rkbs3gspdwq1zhz83&dl=1")
Y is diff_suff
D is vet_pct
X is c_company_kia
FE is state
weights are elig_1865
… interflex plot (you will have to leave out weights)
… interflex plot
… interflex package to handle this

We found that:
Overall, this rejects the interpretation that the suppression of riots is due to a logic in which all parties face competitive pressures to stop anti-minority violence.
Instead, it suggests there is a link between the INC and the support it draws from Muslim voters.
If we don’t proceed from theoretical arguments and look at heterogeneous effects (interactions) by many different attributes…
There are methods to account for looking at multiple hypothesis tests.
These and other corrections are available in the p.adjust function.
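A short illustration of `p.adjust` on a made-up vector of p-values, as if they came from five interaction tests. The Bonferroni and Benjamini-Hochberg methods are shown as two examples of the corrections the function offers.

```r
# Hypothetical p-values from five separate interaction tests
p_raw <- c(0.001, 0.01, 0.03, 0.04, 0.20)

p_bonf <- p.adjust(p_raw, method = "bonferroni")  # controls family-wise error rate
p_bh   <- p.adjust(p_raw, method = "BH")          # controls false discovery rate

data.frame(p_raw, p_bonf, p_bh)
```

Bonferroni multiplies each p-value by the number of tests (capped at 1), while Benjamini-Hochberg is less conservative, so fewer findings are discarded when many tests are run.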
If we find there is a difference between sub-groups, we cannot (typically) say that it is THIS attribute that causes a difference.
How does slavery affect contemporary political attitudes?
what is the path?
Authors argue it is slavery \(\to\) preservation of political/economic power \(\to\) racist norms/institutions \(\to\) contemporary attitudes
But it could also be that there are OTHER mechanisms: slavery \(\to\) greater contemporary African American population \(\to\) contemporary perceived “racial threat” \(\to\) contemporary attitudes
We may be interested in three different situations:
Natural Direct and Indirect Effects can sum up to the total effect of \(D\).
natural indirect effect
\[\delta_i(d) = Y_i(d, M_i(1)) - Y_i(d, M_i(0))\]
What is the effect of changing \(M\) (going from \(M(D=0)\) to \(M(D=1)\)) but not changing \(D\)?
What is the effect of changing the 21st c. black population (as if changing slave population), without changing slave population?
natural direct effect
\[\zeta_i(d) = Y_i(1, M_i(d)) - Y_i(0, M_i(d))\]
What is the effect of changing \(D\) (going from \(D=0\) to \(D=1\)) but not changing \(M\) from the value induced by \(d\)?
What is the effect of changing slave population, without changing the 21st c. black population induced by the actual slave population?
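Using the two definitions above, the total effect of \(D\) for unit \(i\) splits into an indirect and a direct part by adding and subtracting \(Y_i(1, M_i(0))\). The symbol \(\tau_i\) for the total effect is introduced here for illustration:

\[\begin{aligned} \tau_i &= Y_i(1, M_i(1)) - Y_i(0, M_i(0)) \\ &= \left[Y_i(1, M_i(1)) - Y_i(1, M_i(0))\right] + \left[Y_i(1, M_i(0)) - Y_i(0, M_i(0))\right] \\ &= \delta_i(1) + \zeta_i(0) \end{aligned}\]

Adding and subtracting \(Y_i(0, M_i(1))\) instead gives \(\tau_i = \delta_i(0) + \zeta_i(1)\). Either way, the decomposition pairs an indirect effect at one treatment level with a direct effect at the other.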
controlled direct effect
\[\lambda_i(d) = Y_i(1, m) - Y_i(0, m)\]
What is the effect of changing \(D\) (going from \(D=0\) to \(D=1\)) while holding \(M\) at value \(m\)?
What is the effect of changing slave population, holding the 21st c. black population constant at some value?
If \(M\) fully mediates \(D\), then \(CDE\) will be \(0\) for all values of \(m\). (If not, then there is another path other than \(D \to M \to Y\))
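The full-mediation claim can be illustrated with made-up potential-outcome functions for a single unit. All functional forms and numbers below are hypothetical. When \(D\) affects \(Y\) only through \(M\), the controlled direct effect is zero at every \(m\), even though the indirect effect is not.

```r
# Hypothetical potential-outcome functions for one unit
M_of   <- function(d) 1 + 2 * d            # mediator value under treatment d
Y_full <- function(d, m) 3 + 0.5 * m       # D affects Y only through M
Y_part <- function(d, m) 3 + 0.5 * m + d   # D also has a direct path to Y

nie <- function(Y, d) Y(d, M_of(1)) - Y(d, M_of(0))  # natural indirect effect
cde <- function(Y, m) Y(1, m) - Y(0, m)              # controlled direct effect

m_values <- c(0, 1, 5)
cde_full <- sapply(m_values, function(m) cde(Y_full, m))  # 0 0 0
cde_part <- sapply(m_values, function(m) cde(Y_part, m))  # 1 1 1

cde_full
cde_part
nie(Y_full, 1)  # indirect effect is nonzero under full mediation (here 1)
```

A nonzero controlled direct effect at some \(m\), as in the second case, signals a path from \(D\) to \(Y\) that does not run through \(M\).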
Natural Direct Effect and Natural Indirect Effect assume:
this is sequential ignorability
For example, suppose Slave Population \(\to\) Property Values \(\to\) Attitudes, and Property Values \(\to\) Black Population (\(M\)). Then \(M\) is confounded.
Basically, if there are other ways treatment can affect the outcome, and those other paths possibly affect the mediator \(M\), there will be bias. This is almost always possible.
Controlled Direct Effects assume:
this is sequential ignorability version 2:
\(Z\) is a vector of post-treatment confounders. (e.g., property values). We want to “block” the other paths by which \(D\) might affect \(Y\).
Natural direct and indirect effects can be estimated with the mediation package. The assumptions for these are very hard to meet, EVEN IN RANDOMIZED EXPERIMENTS.
Controlled direct effects can be estimated with the DirectEffects package (see the package vignette).
Overall, this is a very contested area methodologically.
On one hand you have Acharya et al 2016 advocating for methods to get at mediating paths.
On the other hand you have Bullock and Green 2010 who basically suggest that we can’t do it at all.