(1) Recap
- Evaluating Descriptive Claims
- Variables vs. Measures
- Validity
(2) Measurement Error
- Bias
- Random
October 7, 2024
We want our evidence to:
We want to be sensitive to:
\(\to\) Concepts
\(\to\) Variables
\(\to\) Measures
Where are we at risk of failing weak severity? Where are the failure points (assumptions) we can check?
Concepts not transparent/systematic \(\xrightarrow{\xcancel{weak \ severity}}\)
Variable does not map onto concept (lack of validity) \(\xrightarrow{\xcancel{weak \ severity}}\) or \(\xrightarrow{\xcancel{pass \ severe \ test}}\)
Procedure does not return the true value (measurement error) \(\xrightarrow{\xcancel{pass \ severe \ test}}\)
Groups attempting to mobilize support/transform policy capitalize on highly publicized events.
They often use individual events to diagnose a “problem” they propose to solve.
Whether or not this is effective, and whether or not their claims are true, picking individual events is a form of evidence that lacks severity.
For every 10,000 black people arrested for violent crime, 3 are killed
For every 10,000 white people arrested for violent crime, 4 are killed
I'm going to keep tweeting this until someone can explain to me how this is possible if there is truly pervasive racial bias in policing
Source: Leonydus Johnson (leave/me/alone) (@LeonydusJohnson), June 1, 2020
Claim: “Racial bias in policing is not pervasive”
Concept: racial bias in policing: disparity in police use of force in excess of “reasonable” considerations such as objective threat posed by suspect
Variable: difference by race in number of people killed by police per persons arrested for violent crimes
Measure: count of press-reported police shootings by race; FBI data on arrests by crime type and race
(Or we could ask, do we have to assume something in order to believe it is valid?)
Claim: “Illegal immigrants commit murder at higher rates”
Concept: “illegal immigrant”, “murder”
Variable: correlation between the fraction of people in a city who are undocumented immigrants and the fraction of people in a city who are murderers
Measure: Pew Research estimate of undocumented migrants, FBI data on murders per capita
One type of validity problem:
Whenever we can make the case, before we ever collect any data, that the variable (what we intend to observe) does not match the concept, there is a lack of validity.
Light et al 2020 investigate claim: “undocumented migrants are prone to violent crime”
concepts: undocumented, violent criminals
variable: conviction rates for violent crime (homicide, assault, robbery, sexual assault) for US-born citizens, legal immigrants, undocumented immigrants
measure: crimes listed in arrests in the Texas Computerized Criminal History database, immigration status as determined by DHS and ICE using biometrics database, estimates of undocumented migrants using Census data
Not the end of the story: Kennedy et al at Center for Immigration Studies dispute these findings:
Complaints focus on:
That is, the procedures by which we observe the variable “murders committed by undocumented migrants”.
Kennedy et al argue:
It takes time for undocumented immigrants in custody to be identified.
Only people in custody for longer periods of time for serious crimes are likely to be thoroughly checked:
About how many cases among people between 70 and 80?
About how many cases among people between 20 and 30?
Measurement error is a difference between the observed value of a variable for a case (produced by the measurement procedure) and the true value of the variable for that case.
\[\mathrm{Value}_{observed} - \mathrm{Value}_{true} \neq 0 \xrightarrow{then} \mathrm{measurement \ error}\]
If what we observe is different from the true value for a case (difference is not 0), then there is measurement ERROR
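A minimal sketch of the definition above, in Python. The case names and values are invented purely for illustration; in practice the true value is usually unknown, which is what makes measurement error hard to check.

```python
# Hypothetical cases: the "true" and "observed" values are invented for illustration.
cases = {
    "case_a": {"true": 10, "observed": 10},
    "case_b": {"true": 7, "observed": 5},
    "case_c": {"true": 3, "observed": 4},
}


def measurement_error(observed, true):
    """Return observed minus true; any nonzero result is measurement error."""
    return observed - true


# Error for each case, and whether that case was measured with error.
errors = {
    name: measurement_error(values["observed"], values["true"])
    for name, values in cases.items()
}
has_error = {name: error != 0 for name, error in errors.items()}

print(errors)
print(has_error)
```

Note that the sign matters: a negative difference means the procedure returned a value below the truth, a positive one means it returned a value above it.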
What is the incidence of sexual misconduct defined here at UBC?
Let’s say a variable is the number of breaches of Sexual Misconduct Policy in a given year.
Measure: Reporting from the UBC Investigations Office.
That implies \(15\) incidents in the 2022-2023 academic year (the last available data): 64 reports \(\to\) 39 investigations \(\to\) 24 completed investigations \(\to\) 15 breaches found.
\[\mathrm{Sexual \ Misconduct }_{observed} - \mathrm{Sexual \ Misconduct}_{true} \neq 0\]
\[\xrightarrow{then} \mathrm{measurement \ error}\]
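To see how much the count shrinks at each step of the reporting procedure, here is a small sketch that uses only the four numbers quoted above. It does not estimate the true number of incidents, which is unknown; it only shows the share of reports that survives each stage. The stage labels in the code are shorthand, not official terms.

```python
# Counts from the 2022-2023 reporting chain quoted in the notes.
funnel = [
    ("reports", 64),
    ("investigations", 39),
    ("completed investigations", 24),
    ("breaches found", 15),
]

total_reports = funnel[0][1]

# Share of the original reports that remains at each stage.
shares = {stage: count / total_reports for stage, count in funnel}

for stage, count in funnel:
    print(f"{stage}: {count} ({shares[stage]:.0%} of reports)")
```

If any real breaches drop out before a finding is made, or are never reported at all, the observed count of 15 would fall below the true count.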
Bias and random error differ in the patterns of \(\mathrm{Value}_{observed} - \mathrm{Value}_{true}\) that we see.
Measures may suffer from both.
bias or systematic measurement error: error produced when our measurement procedure obtains values that are, on average, too high or too low (that is, incorrect) compared to the truth.
Kennedy et al argue that Light et al’s measurement procedures lead to:
\[\mathrm{Migrant \ Homicide \ Rate }_{observed} - \mathrm{Migrant \ Homicide \ Rate}_{true} < 0\]
\[\xrightarrow{then} \mathrm{measurement \ bias}\]
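A rough simulation of what systematic undercounting looks like, assuming a procedure that identifies each true case only with some probability. The true count (100) and the identification probability (0.8) are invented for illustration; they are not estimates from Light et al or Kennedy et al.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

TRUE_COUNT = 100    # hypothetical true number of cases
P_IDENTIFIED = 0.8  # hypothetical chance that a true case is identified


def observe_undercount(true_count, p_identified, rng):
    """Count only the true cases that the procedure manages to identify."""
    return sum(rng.random() < p_identified for _ in range(true_count))


# Repeat the measurement many times and record observed minus true.
biased_errors = [
    observe_undercount(TRUE_COUNT, P_IDENTIFIED, rng) - TRUE_COUNT
    for _ in range(1000)
]
mean_error = statistics.mean(biased_errors)

print(f"mean error: {mean_error:.1f}")
print(f"largest error: {max(biased_errors)}")
```

Because the procedure can miss cases but never invents them, every error is zero or negative, and the errors do not average out: the mean stays well below zero no matter how many times the measurement is repeated.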
This debate over measurement isn’t over, though.
Bias can be different in different subgroups.
random measurement error: errors that occur due to random features of the measurement process or the phenomenon. So even if observed values are sometimes wrong, they are, on average, correct.
Variable: relative change in COVID-19 infections
Measure: “Composite wastewater influent is collected over a 24-hour period from wastewater treatment plants (WWTPs). Samples are collected 2-3x per week at each WWTP and are transported by the BCCDC PHL for analysis. Wastewater samples are concentrated by ultracentrifugal filtration, nucleic acids extracted and SARS-CoV-2 envelope gene (E gene) is detected by real-time quantitative polymerase chain reaction (RT-qPCR).”
Day-to-day variation in:
can lead to errors in measurement, but these errors likely cancel out in the long run.
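A sketch of errors that cancel out in the long run, assuming each measurement equals the true value plus noise centred on zero. The true level (50) and the noise size (5) are invented for illustration and are not taken from the wastewater data.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is reproducible

TRUE_LEVEL = 50.0  # hypothetical true value of the variable
NOISE_SD = 5.0     # hypothetical size of the day-to-day noise

# Each observation is the truth plus zero-centred random noise.
observed = [TRUE_LEVEL + rng.gauss(0, NOISE_SD) for _ in range(10_000)]
random_errors = [value - TRUE_LEVEL for value in observed]

mean_random_error = statistics.mean(random_errors)
largest_single_error = max(abs(error) for error in random_errors)

print(f"mean error: {mean_random_error:.2f}")
print(f"largest single error: {largest_single_error:.2f}")
```

Individual measurements can be far from the truth, yet the average error sits close to zero. Under bias, by contrast, the average error itself is pushed away from zero.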