POLI 110

October 7, 2024

Objectives

(1) Recap

Evaluating Descriptive Claims
Variables vs. Measures
Validity

(2) Measurement Error

Bias
Random

Recap

Severity and Evidence for Descriptive Claims

We want our evidence to:

be capable of showing claim to be wrong (weak severity)
stand up to multiple checks on where it could be wrong (strong severity)

We want to be sensitive to:

what properties of evidence \(\to\) absence of weak severity
what are the multiple failure points (assumptions) that we can check \(\to\) strong severity

Descriptive Claim

\(\to\) Concepts

\(\to\) Variables

\(\to\) Measures

\(\to\) Answer

Where are we at risk of failing weak severity? Where are the failure points (assumptions) we can check?

Descriptive Claim

Concepts not transparent/systematic \(\xrightarrow{\xcancel{weak \ severity}}\)

Variable does not map onto concept (lack of validity) \(\xrightarrow{\xcancel{weak \ severity}}\) or \(\xrightarrow{\xcancel{pass \ severe \ test}}\)

Procedure does not return the true value (measurement error) \(\xrightarrow{\xcancel{pass \ severe \ test}}\)

\(\xcancel{\to}\) “Answer”

Strategies of Political Activists:

Groups attempting to mobilize support/transform policy capitalize on highly publicized events.
They often use individual events to diagnose a “problem” they propose to solve.

Whether or not this strategy is effective, and whether or not their claims are true, picking individual events is a form of evidence that lacks severity.

For every 10,000 black people arrested for violent crime, 3 are killed

For every 10,000 white people arrested for violent crime, 4 are killed

I'm going to keep tweeting this until someone can explain to me how this is possible if there is truly pervasive racial bias in policing
— Leonydus Johnson (leave/me/alone) (@LeonydusJohnson) June 1, 2020

Validity

Claim: “Racial bias in policing is not pervasive”

Concept: racial bias in policing: disparity in police use of force in excess of “reasonable” considerations such as objective threat posed by suspect

Variable: difference by race in number of people killed by police per persons arrested for violent crimes

Measure: count press-reported police-shootings by race, FBI data on arrests by crime-type and race

Discuss: Does this variable have validity?

(Or we could ask, do we have to assume something in order to believe it is valid?)

Board

Validity

Claim: “Illegal immigrants commit murder at higher rates”

Concept: “illegal immigrant”, “murder”

Variable: correlation between the fraction of people in a city who are undocumented immigrants and the fraction of people in a city who are murderers

Measure: Pew Research estimate of undocumented migrants, FBI data on murders per capita

Discuss: Is this a “severe” test of the claim? Or are there possible problems?

We can’t say, based on this.

One type of validity problem:

Ecological Fallacy

using variables that observe behavior at aggregate levels (e.g. country, province, city) to make inferences about individual behaviors.
even if we have perfect data on aggregate variables (e.g. undocumented migrants, murderers), we can still draw incorrect conclusions
the inference is not wrong if its assumption is correct: it assumes that behaviors (e.g. murder) of individuals of different “types” (undocumented migrant, everyone else) are the same regardless of aggregate context (proportion that is undocumented).
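
A minimal sketch of how the aggregate inference can fail, using invented numbers (the city shares and rates below are hypothetical, not drawn from any dataset). In every city the migrant murder rate is lower than the rate among everyone else, yet the city-level correlation between migrant share and murder rate is strongly positive, because the rate among everyone else changes with the aggregate context.

```python
from math import sqrt

# Hypothetical cities: all numbers are invented for illustration.
migrant_share = [0.02, 0.05, 0.10, 0.15]   # fraction of residents who are undocumented
migrant_rate = [2.0, 2.0, 2.0, 2.0]        # murders per 100,000 migrants
other_rate = [4.0, 6.0, 9.0, 12.0]         # murders per 100,000 among everyone else

# Aggregate (city-level) murder rate: a weighted average of the two groups.
city_rate = [
    s * m + (1 - s) * o
    for s, m, o in zip(migrant_share, migrant_rate, other_rate)
]

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

aggregate_corr = pearson(migrant_share, city_rate)
migrants_lower_everywhere = all(m < o for m, o in zip(migrant_rate, other_rate))

print(f"city-level correlation: {aggregate_corr:.2f}")
print(f"migrants have the lower rate in every city: {migrants_lower_everywhere}")
```

The aggregate data alone cannot distinguish this scenario from one in which migrants really do commit murder at higher rates; that is the assumption we would have to check.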

Validity

Whenever we can make the case that the variable (what we intend to observe) does not match the concept, before we EVER collect any data, there is a lack of validity.

Example

Undocumented Migrants and Murder

Light et al 2020 investigate claim: “undocumented migrants are prone to violent crime”

concepts: undocumented, violent criminals

variable: conviction rates for violent crime (homicide, assault, robbery, sexual assault) for US-born citizens, legal immigrants, undocumented immigrants

measure: crimes listed in arrests in the Texas Computerized Criminal History database, immigration status as determined by DHS and ICE using biometrics database, estimates of undocumented migrants using Census data

Not the end of the story: Kennedy et al at Center for Immigration Studies dispute these findings:

Complaints focus on:

how does Texas identify undocumented migrants in arrest and prison records?
how do DHS/ICE identify undocumented migrants?

That is, they dispute the procedures by which we observe the variable “murders committed by undocumented migrants”.

Kennedy et al argue:

It takes time for undocumented immigrants in custody to be identified.
- \(\to\) undercounting of arrested undocumented
- \(\to\) more undercounting in recently arrested
Only people in custody for longer periods of time for serious crimes are likely to be thoroughly checked:
- \(\to\) undercounting is lower/minimal for homicide convictions
- \(\to\) need to use DHS and Texas prison (TDCJ) checks on migration status

They argue that once these measurement problems are “fixed”, the conclusions are reversed.

Measurement Error

Digression: Histograms

video explanation here

About how many cases among people between 70 and 80?

About how many cases among people between 20 and 30?

Measurement Error

Validity is about link between variable and concept

Measurement Error is about link between measure and variable.

Measurement Error

measurement error

is a difference between the observed value of a variable for a case (produced by the measurement procedure) and the true value of the variable for that case.

\[\mathrm{Value}_{observed} - \mathrm{Value}_{true} \neq 0 \xrightarrow{then} \mathrm{measurement \ error}\]

If what we observe is different from the true value for a case (difference is not 0), then there is measurement ERROR

Measurement Error

What is the incidence of sexual misconduct, as defined here, at UBC?

Let’s say the variable is the number of breaches of the Sexual Misconduct Policy in a given year.

Measure: Reporting from the UBC Investigations Office.

That implies \(15\) incidents in the 2022-2023 academic year (last available data): 64 reports \(\to\) 39 investigations \(\to\) 24 completed investigations \(\to\) 15 breaches found.

Is this observed value too high? too low? correct? Why?

Measurement Error

\[\mathrm{Sexual \ Misconduct }_{observed} - \mathrm{Sexual \ Misconduct}_{true} \neq 0\]

\[\xrightarrow{then} \mathrm{measurement \ error}\]

Most likely \(\mathrm{Sexual \ Misconduct }_{observed} - \mathrm{Sexual \ Misconduct}_{true} < 0\)
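
As a rough sketch of why the observed value is likely too low, the funnel counts quoted for the 2022-2023 academic year can be turned into stage-by-stage shares. The variable names are illustrative only. Note that incidents that are never reported do not enter this funnel at all, so the shares below say nothing about them.

```python
# Counts quoted for the 2022-2023 academic year.
funnel = [
    ("reports", 64),
    ("investigations", 39),
    ("completed investigations", 24),
    ("breaches found", 15),
]

# Share of cases that survive each step of the measurement procedure.
step_shares = [
    (later_name, later / earlier)
    for (_, earlier), (later_name, later) in zip(funnel, funnel[1:])
]

for name, share in step_shares:
    print(f"{name}: {share:.0%} of the previous stage")

# The observed value of the variable is the count at the last stage.
observed = funnel[-1][1]
overall_share = observed / funnel[0][1]
print(f"observed breaches: {observed} ({overall_share:.0%} of reports)")
```

Every stage can only drop cases, never add them, which is one reason to expect a downward rather than upward error in this measure.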

Two varieties of measurement error

bias/systematic measurement error
random measurement error

Differ in the patterns of \(\mathrm{Value}_{observed} - \mathrm{Value}_{true}\) that we see.

Measures may suffer from both.

Measurement Error: Bias

bias or systematic measurement error: error produced when our measurement procedure obtains values that are, on average, too high or too low (or, incorrect) compared to the truth.

Key phrase is “on average”: the error is not a one-off fluke; it will happen systematically even if you repeat the measurement procedure.
bias can be upward (observed value too high) or downward (observed value too low)
“bias” here does not mean “politically” biased
bias need not be the same for all cases; it can differ across subgroups
- example: economic evaluations and partisanship in surveys
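
A minimal simulation of what “on average” means here, under assumed values: the true value, the size of the bias, and the amount of chance noise are all hypothetical. Each repetition of the procedure is off by a different amount, but the average error settles near the built-in bias rather than near zero.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

TRUE_VALUE = 100.0   # hypothetical true value of the variable
BIAS = -20.0         # hypothetical systematic undercount built into the procedure

def biased_measure():
    """One run of a hypothetical procedure: truth, plus a fixed bias, plus chance noise."""
    return TRUE_VALUE + BIAS + random.gauss(0, 5)

# Error for each repetition: observed value minus true value.
errors = [biased_measure() - TRUE_VALUE for _ in range(10_000)]
mean_error = mean(errors)

# Repeating the procedure does not help: the average error stays near BIAS, not 0.
print(f"average error over 10,000 repetitions: {mean_error:.1f}")
```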

Measurement Error

Kennedy et al argue that Light et al’s measurement procedures lead to:

\[\mathrm{Migrant \ Homicide \ Rate }_{observed} - \mathrm{Migrant \ Homicide \ Rate}_{true} < 0\]

\[\xrightarrow{then} \mathrm{measurement \ bias}\]

Though, this debate over measurement isn’t over.

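Bias can differ across subgroups: here, the alleged undercounting is larger among the recently arrested and smaller for homicide convictions.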

Measurement Error: Random

random measurement error: errors that occur due to random features of the measurement process or the phenomenon. Even if observed values are sometimes wrong, they are, on average, correct.

Due to chance, we get values that are too high or too low
May be lots of idiosyncratic errors
There is no tilt one way or another (no bias)
In aggregate, values that are “too high” are balanced out by values that are “too low” compared to the truth
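
A minimal simulation of purely random error, again with hypothetical values for the truth and the noise. Individual observations can be far from the truth, but with no tilt in either direction the errors average out close to zero.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is reproducible

TRUE_VALUE = 100.0  # hypothetical true value of the variable

def noisy_measure():
    """One run of a hypothetical procedure: truth plus chance noise, with no bias."""
    return TRUE_VALUE + random.gauss(0, 15)

observations = [noisy_measure() for _ in range(10_000)]

# Individual errors can be large in either direction...
largest_error = max(abs(obs - TRUE_VALUE) for obs in observations)
# ...but values that are too high are balanced by values that are too low.
average_error = mean(observations) - TRUE_VALUE

print(f"largest single error: {largest_error:.1f}")
print(f"average error: {average_error:.2f}")
```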

Measurement Error: Random

Variable: relative change in COVID-19 infections

Measure: “Composite wastewater influent is collected over a 24-hour period from wastewater treatment plants (WWTPs). Samples are collected 2-3x per week at each WWTP and are transported by the BCCDC PHL for analysis. Wastewater samples are concentrated by ultracentrifugal filtration, nucleic acids extracted and SARS-CoV-2 envelope gene (E gene) is detected by real-time quantitative polymerase chain reaction (RT-qPCR).”

Measurement Error: Random

Day-to-day variation in:

wastewater volume (e.g. rain, snowmelt, showering)
fecal matter (e.g. diet, exercise, other diseases)

can lead to errors in measurement, but these errors likely cancel out in the long run.

Practice

Go to menti.com/ and enter the code \(59 \ 93 \ 07 \ 9\)

Conclusion:

Measurement Error

Bias/systematic measurement error
Random measurement error

What is “bias”?
How do you recognize these?
We can have both.