October 2, 2024

Objectives

(1) Concept Review

(2) Evaluating Descriptive Claims

(3) Variables

  • Levels of Measurement
  • Validity

Evaluating Descriptive Claims

Question/Claim \(\xrightarrow{}\)

Concept

  • transparent definition; which observable traits makes something an “X” \(\xrightarrow{}\)

Variable(s)

  • observable properties of cases that correspond to the concept \(\xrightarrow{}\)

Measure(s)

  • procedure to find the values variables take for specific cases \(\xrightarrow{}\)

“Answer”

“Countries around the world have become less democratic since 2016.”

We considered three concepts/definitions for democracy

  1. A democracy is a country in which the word ‘democratic’ appears in the country name
  2. A democracy is a government in which political decisions are made by people who acquire power through competitive elections, the results of which are respected (losers leave office).
  3. A democracy is a government which is chosen through free and fair elections, where these elected officials shape policy and are not corrupt, where there is rule of law where people are equally free organize opposition and have a reasonable chance at winning changes, there is a free press, and where there are individual freedoms to protest, speech, religion/thought, etc.

For our purposes (scientifically evaluating descriptive claims), we only care if these concepts are defined using

  • traits that identify what it means to be in this category that we can all observe
  • (better concepts help us make better predictions, but we can set that aside in this course)
  • All three definitions we gave above work

“Countries around the world have become less democratic since 2016.”

Different concepts lead to different variables, different measures, different answers.

Even if we disagree that these are definitions of “democracy”, we can still evaluate the claim using these definitions.

  • What if we defined democracy as “I know it when I see it”?
  • What if we defined democracy as “Countries where the government enacts the general will”?

Question/Claim

\(\not\xrightarrow{}\) Concepts not transparent/not well formulated

Concept

\(\not\xrightarrow{}\) Variable does not map onto concept

Variables

\(\not\xrightarrow{}\) Procedure does not return the true value

Measure(s)

\(\not\xrightarrow{}\)

“Answer”

Variables
and Measures

Variables and Measures:

variable(s):

A measurable property of cases that corresponds to a concept or part of a concept and can potentially take on different values across cases and time (it varies across cases).

  • something we could observe in principle
  • Chosen to indicate membership in category/presence of attribute (concepts)
  • Variables take on values for each case at a specific point in time
  • Variation across cases and/or over time.
  • General (e.g., “fraction of people legally eligible to vote”, not “fraction of people legally eligible to vote in 2021 Canadian Federal Elections”)

Variables and Measures:

measure(s)

A procedure for determining the value a variable takes for specific cases based on observation.

  • Measures are proposed to determine the value a variable takes for some cases
  • They are always for some specific cases we want to know about (e.g., a procedure for estimating the number of violent crimes committed in a given year.)

A Trivial Example:

A descriptive question:

What is the tallest mountain on the North Shore?

We need to:

  • define a the concept of “tall”
  • create a variable that matches that definition and is observable
  • develop a procedure to obtain values of that variable for mountains on the North Shore

Concept to Measurement:

Concept: tallness (of a mountain)

Elevation (distance from peak to sea level)

\(\xrightarrow{}\) Variable:

Vertical distance in meters from mean sea level to the top of the peak

\(\xrightarrow{}\) Measure:

Use difference in barometric pressure at Burrard Inlet and peak to calculate difference in elevation

Concept to Measurement:

Are you going to climb the mountain? Prominence might be a better concept of height.

Concept to Measurement:

Concept: tallness (of a mountain)

the elevation of a summit relative to the highest point to which one must descend before reascending to a higher summit

\(\xrightarrow{}\) Variable:

Vertical distance in meters from top of the peak to lowest contour line surrounding it and no other higher peaks.

\(\xrightarrow{}\) Measure:

Satellites using radar interferometry create topographical maps; computer algorithm to find lowest contour

Concept to Measurement:

Different concepts \(\to\) different variables


Different variables \(\to\) different measures


Different Answer:

  • Using elevation from sea level, West Lion is taller
  • Using prominence, Seymour is taller.

Variables

Variables

variables:

  • take on values for each specific case at a specific point in time.
    • values vary between cases at the same point in time
    • values vary within cases at different points in time
  • take different kinds of values

Variables: Example:

Variables: Example

Attempt to persuade: “There is an influx in violent migrant crime”

A claim: “13,099 convicted murderers have crossed the border and are free to roam and kill in [the United States]”

Concept(s): undocumented immigrants, convicted murderers, legal detention

Variable: number of non-citizens with criminal convictions facing deportation proceedings but who are not held in immigration agency custody

If we were to go look at “number of non-citizens with criminal convictions facing deportation proceedings but who are not held in immigration agency custody”

What kind of values could this variable take?

Variables: Example

number of non-citizens with criminal convictions facing deportation proceedings but who are not held in immigration agency custody:

  • integer
  • must be greater or equal to \(0\)

Can you think of different variables (for transmissibility)?

  • what values could they take?
  • e.g. “convicted murderers per capita among undocumented migrants”

Variables: Example

“The immigration status of a person.”

This is a variable: what values does it take?

  • Categories (e.g., natural born citizen, naturalized citizen, visa holder, permanent resident, undocumented, etc.)
  • No numeric range
  • One immigration status is not “higher”/“lower” than another.

Variables

The kinds of values taken by a variable is called its level of measurement

Four levels of measurement


  • Nominal
  • Ordinal
  • Interval
  • Ratio

Not to be confused with measures

Levels of measurement: nominal

nominal levels of measurement:

  • place cases into unranked categories
  • discrete groups based on presence/absence of attribute(s)
  • no category is “more” or “less” than another
  • categories are exhaustive (every case can fit in a category)
    • sometimes we just have “other”

Examples:

  • Religion
  • Partisan affiliation
  • Regime type (e.g. minimalist democracy vs non-democracy)
  • Type of crime (e.g. murder vs. assault vs. burglary etc.)

Levels of measurement: ordinal

ordinal levels of measurement

  • place cases into categories that are ranked
  • may have a number attached (or not)
  • Cases can be said to have more or less of something
  • Intervals between categories not meaningful
  • relative levels, not absolute levels

Examples:

  • University rankings
  • Test score percentiles
  • Ideology (very liberal, somewhat liberal, neither, somewhat conservative, very conservative)
  • Level of democracy

Levels of measurement: interval

interval levels of measurement

  • assign cases numbers that rank the cases
  • have intervals between values are meaningful and consistent (1 unit change is the same size each time)
  • difference in values indicates how much more or less of something case has from another
  • no meaningful zero point, so ratios not meaningful

Examples

  • Year (but not years since some event)
  • Temperature (in Celsius, but not Kelvin)
  • Date of first COVID-19 vaccination

Levels of measurement: ratio

ratio levels of measurement

  • assign cases numbers that rank the cases
  • have intervals that are meaningful and consistent
  • difference in values indicates how much more or less
  • zero indicates absence
  • ratios meaningful (something can be twice as much as something else)

Examples

  • Time since some event
  • Counts of events
  • Rates (unemployment, language spoken, political party preference)

Example: Gun Violence

What is the level of measurement?

  1. Cause of death
  2. Number of gun deaths
  3. Change over time in number of gun deaths
  4. Proportion of all murders that involve guns
  5. Strictness of gun ownership regulations
  6. Year in which a country bans assault weapons

Discuss with your neighbors

Example: Gun Violence

What is the level of measurement?

  1. Cause of death? (nominal)
  2. Number of gun deaths? (ratio)
  3. Change over time in number of gun deaths (ratio)
  4. Proportion of all murders that involve guns (ratio)
  5. Strictness of gun ownership regulations (ordinal)
  6. Year in which a country bans assault weapons (interval)

Types of Variables

Variables also give values in absolute and relative units.

absolute values are counts given in raw units

Examples: dollar amounts, Number of events, Number of deaths

relative values are given in fractions or rates, ranks, percentage change

Examples:

  • Units are fractional (GDP per capita, deaths/population, events/time)
  • No units (ordinal rankings)

descriptive claims:

claims about what exists (or has existed/will exist) in the world:

  • what kinds of things exist? (nominal)
  • what is the type of this case? (nominal)
  • how much of something is there? (ordinal, interval, ratio)
  • how much of something is there here vs. there/now vs. then? (ordinal, interval , ratio)
  • what patterns are there in the co-occurrence of different phenomena (depends)

Types of Variables

Suppose we want to know whether this is true: Canada has less gun violence than the United States


Which variable would be best?

  1. Number of gun deaths
  2. Number of murders per capita
  3. Number of murders using guns per capita

Variables: Example

A claim: “13,099 convicted murderers have crossed the border and are free to roam and kill in [the United States]”

Concept(s): undocumented immigrants, convicted murderers, legal detention

Variable: number of non-citizens with criminal convictions facing deportation proceedings but who are not held in immigration agency custody

  • What could go wrong here?
  • non-citizens include legal migrants, ignores detention by other agencies, ignores relative propensity of citizens, non-citizens to murder

Validity

Variables can fail:

Even if we develop a useful concept…

variables may not correspond to the concept

It doesn’t mean the claim is wrong, but it means that evidence is potentially irrelevant.

We have to develop variables that better match the concepts in the claim.

Variable Trouble: Validity

validity:

  • Degree of fit between a variables the concept the variable is intended to capture.
  • When a variable “captures” or “maps onto” the concept we are interested in, then we say they have “validity”
  • When a variable “captures” or “maps onto” other concepts we are not interested in, then we say they lack “validity”
  • How to know we are talking about validity problems: even if we are able to perfectly observe (measure) something, it doesn’t speak to the claim.

Variable Trouble: Validity

“Which country/province is most politically corrupt?”

Concept: Political Corruption or “the use of power by government officials for illegitimate private gain”

Variable: Fraction of political officeholders in a place prosecuted for corruption

Measure: Match criminal court defendants in corruption prosecutions to list of politicians.

Does this variable have validity? (Does it correspond only to the intended concept?)

“Which country/province is most politically corrupt?”

Concept: Political Corruption or “the use of power by government officials for illegitimate private gain”

Variable: Fraction of political officeholders in a place prosecuted for corruption

Measure: Match criminal court defendants in corruption prosecutions to list of politicians.

Problems

  • Places with lots of corruption do not prosecute corruption
  • Places with low corruption successfully prosecute corruption
  • Places have different corruption laws.
  • But the measure may give correct values for the variable

Variable Trouble: Validity

Light et al 2020 investigate claim: “undocumented migrants are prone to violent crime”

concepts: undocumented, violent criminals

variable: arrest rates for violent crime (homicide, assault, robbery, sexual assault) for US-born citizens, legal immigrants, undocumented immigrants

measure: crimes listed in arrests in the Texas Computerized Criminal History database, immigration status as determined by DHS and ICE using biometrics database, estimates of undocumented migrants using Census data

Variable Trouble: Validity

Light et al 2020 investigate claim: “undocumented migrants are prone to violent crime”

concepts: undocumented, violent criminals

variable: arrest rates for violent crime (homicide, assault, robbery, sexual assault) for US-born citizens, legal immigrants, undocumented immigrants

measure: crimes listed in arrests in the Texas Computerized Criminal History database, immigration status as determined by DHS and ICE using biometrics database, estimates of undocumented migrants using Census data

  • arrests \(\neq\) convictions \(\neq\) actual crimes
  • many violent crimes go “unsolved”

Conclusion

  1. Distinguish between Variables and Measures

  2. Variables:

    • levels of measurement
    • absolute/relative
  3. Validity

    • what is it?
    • why do variables not have it?