# lecture outline

recap

what are we asking from regression

implications/insights

insight: minimizes overall distance from y to y-hat:

  • does not fit mean everywhere

  • puts more weight on minimizing distance where there are more cases

  • best linear approximation
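A minimal sketch of this insight, assuming Python with NumPy and made-up numbers (the x values, conditional means, and case counts are all hypothetical). The conditional mean of y is not linear in x, and there are very few cases at x = 2, so the least squares line stays close to the conditional means where cases are plentiful and misses badly where they are sparse.

```python
import numpy as np

# Hypothetical data: x takes three values; E[y | x] is not linear in x.
x_values = np.array([0.0, 1.0, 2.0])
cell_means = np.array([0.0, 1.0, 5.0])   # made-up conditional means
counts = np.array([100, 100, 5])          # very few cases at x = 2

x = np.repeat(x_values, counts)
y = np.repeat(cell_means, counts)         # no noise: y equals its conditional mean

# Least squares fit of y on an intercept and x.
X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

# Compare the fitted line with the conditional mean at each x value.
fitted = beta[0] + beta[1] * x_values
gaps = np.abs(cell_means - fitted)
print("fitted line at x = 0, 1, 2:", fitted.round(3))
print("gap to conditional mean:   ", gaps.round(3))
```

The line does not pass through any of the three conditional means, and the largest gap is at the x value with the fewest cases.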

regression and best linear approximation of CEF:

  • regress on group averages… w/ weights, ignoring weights.
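A sketch of the group-averages point, assuming Python with NumPy and simulated data (the square-root CEF, the group sizes, and the helper names `ols` and `design` are all hypothetical). Regressing the group means of y on x, weighted by group size, should reproduce the regression on the individual cases; dropping the weights should give a different line.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: four x values with very unequal group sizes.
x_values = np.array([0.0, 1.0, 2.0, 3.0])
counts = np.array([200, 100, 20, 5])
x = np.repeat(x_values, counts)
y = np.sqrt(x) + rng.normal(scale=0.1, size=x.size)   # nonlinear CEF, made up

def ols(X, y, w=None):
    """Least squares; optional weights are applied by scaling rows by sqrt(w)."""
    if w is not None:
        s = np.sqrt(w)
        X, y = X * s[:, None], y * s
    return np.linalg.lstsq(X, y, rcond=None)[0]

def design(x):
    """Design matrix with an intercept column and x."""
    return np.column_stack([np.ones_like(x), x])

group_means = np.array([y[x == v].mean() for v in x_values])

b_micro = ols(design(x), y)                                   # all individual cases
b_weighted = ols(design(x_values), group_means, w=counts)     # means, weighted by n
b_unweighted = ols(design(x_values), group_means)             # means, weights ignored

print("individual cases:     ", b_micro.round(3))
print("group means, weighted:", b_weighted.round(3))
print("group means, no wts:  ", b_unweighted.round(3))
```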

insight: minimizes overall distance from y to y-hat: puts more weight on minimizing distance where there are more cases.

y is linear in coefficients

  • implications for imputing, implications for controlling (conditioning on X, assume D is linear in X)

  • linear: multiplicative and additive => implications for how to set up design matrix to get right coefficients

  • coefficients: fitting means — fit difference in means: keep in mind predictions

    • link to equation - design matrix

    • just vector of 1s

    • vector of 2s

      • if we change scale of x variables, scale of betas changes: x times c, then beta times 1/c
    • vector of 1s, vector of 0,1

    • vector 0,1, vector 0,1

    • vectors of 1s, 0,1, 0,1 (non independent) => dummy vars, reference categories

    • diff in diff:

      • how can we get regression to calculate the relevant differences?
      • 4 cases: treated, untreated, pre post — how many columns, what values?
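One possible answer to the diff-in-diff question, sketched in Python with NumPy on made-up data (two hypothetical observations in each of the four cases). Design A uses four columns: a vector of 1s, a treated dummy, a post dummy, and their product. Design B uses one 0/1 column per case and no vector of 1s.

```python
import numpy as np

# Hypothetical data: two observations in each of the four cases.
treated = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
post    = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)
y       = np.array([9, 11, 11, 13, 10, 12, 15, 17], dtype=float)

# Design A: vector of 1s, treated, post, treated x post.
X_a = np.column_stack([np.ones(8), treated, post, treated * post])
b_a = np.linalg.lstsq(X_a, y, rcond=None)[0]

# Design B: one 0/1 column per case, no vector of 1s.
X_b = np.column_stack([
    (1 - treated) * (1 - post),   # untreated, pre
    (1 - treated) * post,         # untreated, post
    treated * (1 - post),         # treated, pre
    treated * post,               # treated, post
])
b_b = np.linalg.lstsq(X_b, y, rcond=None)[0]

# Diff in diff computed by hand from the four case means in b_b.
did_by_hand = (b_b[3] - b_b[2]) - (b_b[1] - b_b[0])

print("design A:", b_a.round(3))   # untreated pre mean, group gap, time change, diff in diff
print("design B:", b_b.round(3))   # the four case means
print("diff in diff by hand:", round(did_by_hand, 3))
```

Both designs give the same predicted y for every observation. Only the meaning of the coefficients changes: the last coefficient in design A is the diff in diff, while design B returns the four case means directly.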

    lessons:

    • we can get identical predictions of y, but different coefficients with different interpretations depending on how we set up design matrix (can be useful)

    • can get group means directly from least squares

    • continuous, no controls: Angrist and Pischke: bivariate… weighted average of derivative (weighted by what?)

      • intercept meaning
      • what if we center the continuous variable around its mean beforehand?
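A sketch of the bivariate case in Python with NumPy, on simulated data (the data-generating numbers and the helper `fit` are hypothetical). It checks three things from the list above: the slope equals cov(x, y) / var(x); centering x leaves the slope alone and turns the intercept into the mean of y; multiplying x by c multiplies the slope by 1/c.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(20, 60, size=200)             # hypothetical continuous x
y = 5.0 + 0.3 * x + rng.normal(size=200)      # made-up data-generating process

def fit(x, y):
    """Least squares of y on a vector of 1s and x; returns [intercept, slope]."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_raw = fit(x, y)
b_centered = fit(x - x.mean(), y)             # x centered around its mean
c = 10.0
b_scaled = fit(c * x, y)                      # x times c

slope_by_hand = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

print("raw:     ", b_raw.round(4))
print("centered:", b_centered.round(4), " mean of y:", y.mean().round(4))
print("scaled:  ", b_scaled.round(4))
print("cov/var: ", slope_by_hand.round(4))
```

In the raw fit the intercept is the prediction at x = 0, which may lie far outside the data. After centering it is the prediction at the mean of x, which equals the mean of y.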

coefficients with controls:

dummy variables: if 0/1, always able to interpret pretty clearly for specific groups; if the 0/1 variables are not exclusive… what is going on?

    • dummy variable orthogonal to the other: what happens? how to interpret?

    • centered around the mean

    • gender, profession, earnings… plug in mean gender for profession… what is the implication of this?

    • recall the coefficient for slope

    • weighting: what if all doctors were female? what weight would go
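A sketch of what "controlling" does to a dummy coefficient, in Python with NumPy on simulated data (the earnings equation, the shares of women by profession, and all variable names are hypothetical). The coefficient on female from the regression with a doctor control should match the coefficient from regressing earnings on residual female, where residual female is female minus its mean within each profession.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
doctor = rng.integers(0, 2, size=n).astype(float)
# Hypothetical: women are less common among doctors, so the two dummies are correlated.
female = (rng.uniform(size=n) < np.where(doctor == 1, 0.3, 0.6)).astype(float)
earnings = 50 + 30 * doctor - 5 * female + rng.normal(scale=5, size=n)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)

# Full regression: earnings on a vector of 1s, female, doctor.
b_full = ols(np.column_stack([ones, female, doctor]), earnings)

# Residualize female on the control (vector of 1s and doctor).
Z = np.column_stack([ones, doctor])
female_resid = female - Z @ ols(Z, female)

# Regress earnings on residual female alone.
b_resid = ols(female_resid[:, None], earnings)

print("female coefficient, with doctor control:", b_full[1].round(4))
print("female coefficient, residual female:    ", b_resid[0].round(4))
print("residual female . doctor:               ", (female_resid @ doctor).round(8))
```

Residual female is orthogonal to the doctor dummy by construction. On the weighting question: in this setup, if every doctor were female, residual female would be 0 for all doctors, so doctors would contribute nothing to the female coefficient.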

  • conditioning on a dummy: rainfall… relationship b/t rainfall and brexit vote… how to interpret… 0 correlation with indicators for region… residual rainfall… what happens then?

  • continuous: age, conditioning on hours worked (continuous): what does this mean?

    • math: orthogonal - 0 correlation

    • interpretation:

    • regression weights insight

    • if this is wrong, what can we do?

      • dummies - at every single value of x
    • transformations: log, sqrt, squared or higher polynomial

  • linearity and transformations:

    • linear in terms
    • polynomials can perfectly fit data… but…
      • we now risk overfitting the data
      • we no longer get clearly interpretable coefficients
  • additivity and transformations

    • gender x profession, interaction
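A sketch of the additivity point, in Python with NumPy and made-up cell means (the four earnings numbers are hypothetical). With only female and doctor columns the model is additive and cannot reproduce all four gender x profession means; adding the product column makes the fit exact.

```python
import numpy as np

# Hypothetical mean earnings in the four gender x profession cells (made up).
female   = np.array([0.0, 0.0, 1.0, 1.0])
doctor   = np.array([0.0, 1.0, 0.0, 1.0])
earnings = np.array([50.0, 90.0, 48.0, 70.0])

ones = np.ones(4)
X_additive = np.column_stack([ones, female, doctor])
X_interact = np.column_stack([ones, female, doctor, female * doctor])

b_add = np.linalg.lstsq(X_additive, earnings, rcond=None)[0]
b_int = np.linalg.lstsq(X_interact, earnings, rcond=None)[0]

resid_add = earnings - X_additive @ b_add
resid_int = earnings - X_interact @ b_int

print("additive coefficients:   ", b_add.round(3))
print("additive residuals:      ", resid_add.round(3))
print("interaction coefficients:", b_int.round(3))
print("interaction residuals:   ", resid_int.round(3))
```

In the interaction design the doctor coefficient is the doctor gap among men only, and the product coefficient is how much that gap differs for women.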