# lecture outline

## recap
what are we asking from regression?

- effects (differences in means / average slopes): interpret coefficients as estimating something… what is being estimated?
- how are observations being weighted?
- plug in values of E[y | X]: how are values being plugged in?
## implications/insights

insight: least squares minimizes overall squared distance from y to y-hat:

- does not fit the conditional mean everywhere
- puts more weight on minimizing distance where there are more cases
- best linear approximation
regression and best linear approximation of CEF:

- regress on group averages… with weights vs. ignoring weights
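A minimal sketch of the group-averages point, using made-up data (the arrays `x` and `y` and the helper `ols` are illustrative assumptions, not from the lecture). Least squares on the microdata gives the same line as least squares on the group means of y weighted by group size; ignoring the weights gives a different line. The last line prints the gaps between the group means and the fitted line, which are not zero because the conditional means here are not linear in x.

```python
import numpy as np

# Made-up microdata: three x values with unequal group sizes (4, 2, 1 cases).
x = np.array([0, 0, 0, 0, 1, 1, 2], dtype=float)
y = np.array([1, 2, 1, 2, 4, 6, 6], dtype=float)

def ols(X, y, w=None):
    """Least squares coefficients; optional weights via sqrt-weight rescaling."""
    if w is not None:
        s = np.sqrt(w)
        X, y = X * s[:, None], y * s
    return np.linalg.lstsq(X, y, rcond=None)[0]

# 1) Regression on the microdata.
X = np.column_stack([np.ones_like(x), x])
b_micro = ols(X, y)

# 2) Regression on group averages (the mean of y at each x value).
xs, counts = np.unique(x, return_counts=True)
ybar = np.array([y[x == v].mean() for v in xs])
Xg = np.column_stack([np.ones_like(xs), xs])
b_weighted = ols(Xg, ybar, w=counts)   # weights = group sizes
b_unweighted = ols(Xg, ybar)           # ignoring weights

print(b_micro, b_weighted, b_unweighted)
print(ybar - Xg @ b_micro)  # gaps between group means and the fitted line
```

The group with four cases pulls the line toward itself; the unweighted fit treats all three group means as equally informative.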
#### y is linear in coefficients

- implications for imputing, implications for controlling (conditioning on X, assume D is linear in X)
- linear: multiplicative and additive => implications for how to set up design matrix to get right coefficients
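In symbols, a sketch of what "multiplicative and additive" means here (the notation $x_{ik}$, $\beta_k$, $K$ is an assumption, not taken from the lecture): each column of the design matrix is multiplied by its coefficient, and the products are added up.

$$
\hat{y}_i = \beta_0 \cdot 1 + \beta_1 x_{i1} + \dots + \beta_K x_{iK},
\qquad
\hat{y} = X\beta
$$

The columns of $X$ can themselves be dummies, logs, or squares of the underlying variables; linearity refers only to how the coefficients enter.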
- coefficients: fitting means, fitting differences in means: keep in mind predictions
  - link to equation: design matrix
  - just vector of 1s
  - vector of 2s
    - if we change scale of x variables, scale of betas changes: x times c, then beta times 1/c
  - vector of 1s, vector of 0/1
  - vector of 0/1, vector of 0/1
  - vectors of 1s, 0/1, 0/1 (not linearly independent) => dummy vars, reference categories
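The design matrices listed above can be tried out directly. This is a minimal sketch with made-up numbers (`y`, `d`, and the helper `ols` are illustrative assumptions): a column of 1s returns the mean, a column of 2s returns half the mean, 1s plus a dummy returns a reference-group mean and a difference, and 1s plus both dummies is rank deficient.

```python
import numpy as np

y = np.array([2.0, 4.0, 6.0, 8.0])     # made-up outcomes
d = np.array([0.0, 0.0, 1.0, 1.0])     # 0/1 group indicator
ones = np.ones_like(y)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_ones = ols(ones[:, None], y)         # column of 1s: coefficient is the mean of y
b_twos = ols(2 * ones[:, None], y)     # column of 2s: coefficient is mean / 2

# 1s plus a 0/1 dummy: intercept = mean of the d == 0 group,
# slope = difference in means between the groups.
b_ref = ols(np.column_stack([ones, d]), y)

# 1s plus both dummies: the columns are linearly dependent (d0 + d1 = 1s),
# so the matrix has rank 2, not 3, and one category must be dropped.
X_full = np.column_stack([ones, 1 - d, d])
rank = np.linalg.matrix_rank(X_full)

print(b_ones, b_twos, b_ref, rank)
```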
- diff in diff:
  - how can we get regression to calculate the relevant differences?
  - 4 cases (treated/untreated × pre/post): how many columns, what values?
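One possible answer to the columns question, sketched with made-up outcomes (the variable names and the helper `cell_mean` are assumptions for illustration): four columns (1s, treated, post, treated × post), so that the coefficient on the product column equals the difference in differences of the four cell means.

```python
import numpy as np

# Made-up outcomes, two observations in each of the 4 cells.
treated = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
post    = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)
y       = np.array([1, 3, 4, 6, 2, 4, 9, 11], dtype=float)

# Four columns: 1s, treated, post, treated * post.
X = np.column_stack([np.ones_like(y), treated, post, treated * post])
b = np.linalg.lstsq(X, y, rcond=None)[0]

def cell_mean(t, p):
    """Mean outcome in the cell with treated == t and post == p."""
    return y[(treated == t) & (post == p)].mean()

# (change among treated) minus (change among untreated)
did = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

print(b)    # untreated pre mean, treated gap pre, untreated change, diff in diff
print(did)
```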
lessons:

- we can get identical predictions of y, but different coefficients with different interpretations, depending on how we set up the design matrix (can be useful)
- can get group means directly from least squares
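Both lessons can be checked with a small sketch on made-up data (`X_ref` and `X_cell` are illustrative names): reference-category coding and one-dummy-per-group coding give different coefficients but the same fitted values, and the second coding returns the group means themselves.

```python
import numpy as np

y = np.array([2.0, 4.0, 6.0, 8.0, 10.0])   # made-up outcomes
d = np.array([0.0, 0.0, 1.0, 1.0, 1.0])    # 0/1 group indicator

X_ref = np.column_stack([np.ones_like(y), d])   # 1s and a dummy
X_cell = np.column_stack([1 - d, d])            # one dummy per group, no 1s

b_ref = np.linalg.lstsq(X_ref, y, rcond=None)[0]
b_cell = np.linalg.lstsq(X_cell, y, rcond=None)[0]

print(b_ref)    # mean of group 0, then difference in means
print(b_cell)   # mean of group 0, mean of group 1
print(np.allclose(X_ref @ b_ref, X_cell @ b_cell))  # same predictions
```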
- continuous, no controls: Angrist and Pischke: bivariate… weighted average of derivative (weighted by what?)…
  - intercept meaning
  - what if we center the continuous variable around its mean beforehand?
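A quick sketch of the centering question on made-up data (`x`, `y`, and the helper `ols` are assumptions for illustration): centering x leaves the slope unchanged and moves the intercept from the prediction at x = 0 to the prediction at mean x, which is the mean of y.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # made-up continuous predictor
y = np.array([2.0, 3.0, 5.0, 4.0, 6.0])    # made-up outcome

def ols(x, y):
    """Intercept and slope from a bivariate least squares fit."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]

a_raw, b_raw = ols(x, y)                 # intercept = predicted y at x = 0
a_ctr, b_ctr = ols(x - x.mean(), y)      # intercept = predicted y at mean x

print(a_raw, b_raw)
print(a_ctr, b_ctr, y.mean())
```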
coefficients with controls:

- dummy variables: if 0/1, always able to pretty clearly interpret for specific groups; if 0/1 are not exclusive… what is going on?
  - dummy variable orthogonal to the other: what happens? how to interpret?
  - centered around the mean
  - gender, profession, earnings… plug in mean gender for profession… what is the implication of this…
  - recall the coefficient for slope
  - weighting: what if all doctors were female? what weight would go
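A sketch of the weighting question with made-up earnings data (all numbers and names are hypothetical). With one dummy per profession as controls, the coefficient on `female` matches a weighted average of the within-profession gaps, with weights n_p × share_female_p × (1 − share_female_p). In this toy data profession 2 stands in for "doctors" and is all female.

```python
import numpy as np

# Made-up data. Profession 2 ("doctors") is all female.
prof   = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2])
female = np.array([0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1], dtype=float)
earn   = np.array([10, 12, 8, 10, 20, 14, 16, 18, 30, 32, 34], dtype=float)

# Regression of earnings on female plus one dummy per profession.
P = (prof[:, None] == np.unique(prof)[None, :]).astype(float)
b = np.linalg.lstsq(np.column_stack([female, P]), earn, rcond=None)[0]
b_female = b[0]

# Same number by hand: within-profession gaps, weighted by
# n_p * share_female_p * (1 - share_female_p).
gaps, weights = [], []
for p in np.unique(prof):
    f, e = female[prof == p], earn[prof == p]
    share = f.mean()
    w = len(f) * share * (1 - share)
    # No gap can be computed when a profession has only one gender.
    gap = e[f == 1].mean() - e[f == 0].mean() if w > 0 else 0.0
    gaps.append(gap)
    weights.append(w)

weighted_gap = np.dot(weights, gaps) / np.sum(weights)
print(b_female, weighted_gap, weights)
```

The all-female profession gets weight zero: there is no variation in gender left once profession is held fixed, so it contributes nothing to the coefficient.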
- conditioning on a dummy: rainfall… relationship b/t rainfall and Brexit vote… how to interpret… 0 correlation with indicators for region… residual rainfall… what happens then?
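A sketch of the residual-rainfall idea on made-up numbers (`rain`, `leave`, and `region` are hypothetical, not real Brexit data). The coefficient on rainfall in a regression with region indicators equals the slope of the vote on rainfall after the region means have been removed from it.

```python
import numpy as np

# Made-up data: 2 regions, 3 areas each.
region = np.array([0, 0, 0, 1, 1, 1])
rain   = np.array([1.0, 2.0, 3.0, 5.0, 6.0, 7.0])
leave  = np.array([50.0, 52.0, 51.0, 40.0, 43.0, 42.0])  # hypothetical vote share

R = (region[:, None] == np.unique(region)[None, :]).astype(float)

# Coefficient on rainfall, conditioning on region indicators.
b_full = np.linalg.lstsq(np.column_stack([rain, R]), leave, rcond=None)[0][0]

# Residual rainfall: what is left after regressing rainfall on the region
# indicators, i.e. rainfall minus its region mean.
g = np.linalg.lstsq(R, rain, rcond=None)[0]
rain_resid = rain - R @ g

# Bivariate slope of the vote on residual rainfall.
b_resid = (rain_resid @ leave) / (rain_resid @ rain_resid)

# Ignoring region entirely gives a different answer in this toy data.
b_naive = np.polyfit(rain, leave, 1)[0]
print(b_full, b_resid, b_naive)
```

By construction the residual rainfall has zero correlation with the region indicators, so only within-region variation in rainfall identifies the coefficient.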
- continuous: age, conditioning on hours worked (continuous): what does this mean?
  - math: orthogonal, 0 correlation
  - interpretation:
  - regression weights insight
  - if this is wrong, what can we do?
    - dummies: at every single value of x
  - transformations: log, sqrt, squared or higher polynomial
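A sketch of the "dummies at every single value of x" option, on made-up data with a discrete x (names and numbers are illustrative assumptions): one dummy per value of x returns the conditional means exactly, while the linear fit misses them when the relationship is not linear.

```python
import numpy as np

x = np.array([1, 1, 2, 2, 3, 3], dtype=float)   # made-up discrete predictor
y = np.array([1.0, 3.0, 7.0, 9.0, 9.0, 11.0])   # made-up outcome, not linear in x

# Linear in x: one slope for the whole range.
X_lin = np.column_stack([np.ones_like(x), x])
fit_lin = X_lin @ np.linalg.lstsq(X_lin, y, rcond=None)[0]

# One dummy per value of x: the coefficients are the conditional means.
vals = np.unique(x)
X_sat = (x[:, None] == vals[None, :]).astype(float)
b_sat = np.linalg.lstsq(X_sat, y, rcond=None)[0]

cond_means = np.array([y[x == v].mean() for v in vals])
print(b_sat, cond_means)
print(fit_lin)
```

This only works when x takes few enough distinct values that each one has cases behind it.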
- linearity and transformations
  - linear in terms
  - polynomials can perfectly fit data… but…
    - we now risk overfitting the data
    - we no longer get clearly interpretable coefficients
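A sketch of the perfect-fit point on five made-up observations (all values are illustrative): a degree-4 polynomial has five coefficients, so it passes through all five points, yet its prediction one step outside the data is far from anything observed.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])    # made-up predictor
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])    # made-up outcome

# Linear fit: two coefficients, residuals remain.
X1 = np.vander(x, 2, increasing=True)       # columns: 1, x
b1 = np.linalg.lstsq(X1, y, rcond=None)[0]
res1 = y - X1 @ b1

# Degree-4 polynomial: five coefficients for five points, so the curve
# passes through every observation.
X4 = np.vander(x, 5, increasing=True)       # columns: 1, x, x^2, x^3, x^4
b4 = np.linalg.lstsq(X4, y, rcond=None)[0]
res4 = y - X4 @ b4

# Predictions one step outside the observed range.
x_new = np.array([5.0])
pred1 = np.vander(x_new, 2, increasing=True) @ b1
pred4 = np.vander(x_new, 5, increasing=True) @ b4

print(np.abs(res1).max(), np.abs(res4).max())
print(pred1, pred4)
```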
- additivity and transformations
  - gender x profession, interaction
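A sketch of the gender x profession interaction with made-up earnings (all names and numbers are hypothetical): the additive model imposes one gender gap for everyone, while adding the product column lets the gap differ by profession and reproduces the four cell means.

```python
import numpy as np

# Made-up earnings, two people per gender x profession cell.
female = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)
doctor = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
earn   = np.array([10, 12, 9, 11, 30, 32, 19, 21], dtype=float)
ones = np.ones_like(earn)

X_add = np.column_stack([ones, female, doctor])
X_int = np.column_stack([ones, female, doctor, female * doctor])

b_add = np.linalg.lstsq(X_add, earn, rcond=None)[0]
b_int = np.linalg.lstsq(X_int, earn, rcond=None)[0]

# Additive model: one gender gap for everyone (b_add[1]).
# Interacted model: gap among non-doctors is b_int[1],
# gap among doctors is b_int[1] + b_int[3].
gap_non_doctors = b_int[1]
gap_doctors = b_int[1] + b_int[3]

print(b_add)
print(b_int, gap_non_doctors, gap_doctors)
```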