POLI 110

October 2, 2024

Objectives

(1) Concept Review

(2) Evaluating Descriptive Claims

(3) Variables

Levels of Measurement
Validity

Evaluating Descriptive Claims

Question/Claim \(\xrightarrow{}\)

Concept

transparent definition; which observable traits make something an “X” \(\xrightarrow{}\)

Variable(s)

observable properties of cases that correspond to the concept \(\xrightarrow{}\)

Measure(s)

procedure to find the values variables take for specific cases \(\xrightarrow{}\)

“Answer”

“Countries around the world have become less democratic since 2016.”

We considered three concepts/definitions for democracy

A democracy is a country in which the word ‘democratic’ appears in the country name
A democracy is a government in which political decisions are made by people who acquire power through competitive elections, the results of which are respected (losers leave office).
A democracy is a government that is chosen through free and fair elections, where the elected officials shape policy and are not corrupt, where there is rule of law, where people are equally free to organize opposition and have a reasonable chance at winning changes, where there is a free press, and where there are individual freedoms of protest, speech, religion/thought, etc.

For our purposes (scientifically evaluating descriptive claims), we only care if these concepts are defined using

traits that identify what it means to be in this category that we can all observe
(better concepts help us make better predictions, but we can set that aside in this course)

All three definitions we gave above work

“Countries around the world have become less democratic since 2016.”

Different concepts lead to different variables, different measures, different answers.

Even if we disagree that these are definitions of “democracy”, we can still evaluate the claim using these definitions.

What if we defined democracy as “I know it when I see it”?
What if we defined democracy as “Countries where the government enacts the general will”?

Question/Claim

\(\not\xrightarrow{}\) Concepts not transparent/not well formulated

Concept

\(\not\xrightarrow{}\) Variable does not map onto concept

Variables

\(\not\xrightarrow{}\) Procedure does not return the true value

Measure(s)

\(\not\xrightarrow{}\)

“Answer”

Variables and Measures

Variables and Measures:

variable(s):

A measurable property of cases that corresponds to a concept or part of a concept and can potentially take on different values across cases and time (it varies across cases).

something we could observe in principle
Chosen to indicate membership in category/presence of attribute (concepts)
Variables take on values for each case at a specific point in time
Variation across cases and/or over time.
General (e.g., “fraction of people legally eligible to vote”, not “fraction of people legally eligible to vote in 2021 Canadian Federal Elections”)

Variables and Measures:

measure(s)

A procedure for determining the value a variable takes for specific cases based on observation.

Measures are proposed to determine the value a variable takes for some cases
They are always for some specific cases we want to know about (e.g., a procedure for estimating the number of violent crimes committed in a given year.)

A Trivial Example:

A descriptive question:

What is the tallest mountain on the North Shore?

We need to:

define the concept of “tall”
create a variable that matches that definition and is observable
develop a procedure to obtain values of that variable for mountains on the North Shore

Concept to Measurement:

Concept: tallness (of a mountain)

Elevation (distance from peak to sea level)

\(\xrightarrow{}\) Variable:

Vertical distance in meters from mean sea level to the top of the peak

\(\xrightarrow{}\) Measure:

Use difference in barometric pressure at Burrard Inlet and peak to calculate difference in elevation
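
A rough sketch of how such a measure could work: the hypsometric equation turns a ratio of two pressure readings into a height difference. The pressure and temperature values below are illustrative assumptions, not real observations, and the constant-temperature simplification is ours rather than part of the lecture.

```python
import math

R_DRY_AIR = 287.05   # specific gas constant for dry air, J/(kg*K)
GRAVITY = 9.80665    # standard gravity, m/s^2


def elevation_difference(p_low_hpa, p_high_hpa, mean_temp_k):
    """Height gained (in meters) between a lower and a higher station.

    Assumes one constant mean air temperature between the two stations.
    """
    scale_height = R_DRY_AIR * mean_temp_k / GRAVITY
    return scale_height * math.log(p_low_hpa / p_high_hpa)


# Hypothetical readings: a sea-level station and a peak (not real data)
height_m = elevation_difference(p_low_hpa=1013.25, p_high_hpa=850.0,
                                mean_temp_k=283.15)
print(f"Estimated elevation difference: {height_m:.0f} m")
```

The procedure is only as good as its assumptions: if the air temperature between the two stations is far from the assumed mean, the measure returns a value that differs from the true value of the variable.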

Concept to Measurement:

Are you going to climb the mountain? Prominence might be a better concept of height.

Concept to Measurement:

Concept: tallness (of a mountain)

Prominence (the elevation of a summit relative to the highest point to which one must descend before reascending to a higher summit)

\(\xrightarrow{}\) Variable:

Vertical distance in meters from the top of the peak to the lowest contour line that encircles it but no higher peak.

\(\xrightarrow{}\) Measure:

Satellites using radar interferometry create topographical maps; computer algorithm to find lowest contour
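
To make the idea of "finding the lowest contour" concrete, here is a minimal sketch that computes prominence along a one-dimensional elevation profile. This is a simplification we are assuming for illustration (real topographic data is two-dimensional), and the profile values are made up.

```python
def prominence(profile, i):
    """Prominence (m) of the point at index i in a 1-D elevation profile.

    Walk outward in each direction until higher ground is reached and
    record the lowest point passed on the way. The higher of those low
    points is the key col. If there is no higher ground on either side,
    measure from sea level (0).
    """
    peak = profile[i]
    cols = []
    for side in (range(i - 1, -1, -1), range(i + 1, len(profile))):
        lowest = peak
        for j in side:
            if profile[j] > peak:      # reached higher ground
                cols.append(lowest)    # lowest point on the way there
                break
            lowest = min(lowest, profile[j])
    key_col = max(cols) if cols else 0
    return peak - key_col


# Hypothetical ridge line, elevations in meters (not real mountains)
profile = [0, 400, 1200, 900, 1400, 300, 1000, 0]
peaks = [2, 4, 6]   # indices of the three summits

by_elevation = sorted(peaks, key=lambda i: profile[i], reverse=True)
by_prominence = sorted(peaks, key=lambda i: prominence(profile, i), reverse=True)
print(by_elevation)    # [4, 2, 6]
print(by_prominence)   # [4, 6, 2]
```

The second and third summits swap places depending on which variable is used, even though the underlying terrain is the same.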

Concept to Measurement:

Different concepts \(\to\) different variables

Different variables \(\to\) different measures

Different Answer:

Using elevation from sea level, West Lion is taller
Using prominence, Seymour is taller.

Variables

variables:

take on values for each specific case at a specific point in time.
- values vary between cases at the same point in time
- values vary within cases at different points in time
take different kinds of values

Variables: Example

Attempt to persuade: “There is an influx in violent migrant crime”

A claim: “13,099 convicted murderers have crossed the border and are free to roam and kill in [the United States]”

Concept(s): undocumented immigrants, convicted murderers, legal detention

Variable: number of non-citizens with criminal convictions facing deportation proceedings but who are not held in immigration agency custody

If we were to go look at “number of non-citizens with criminal convictions facing deportation proceedings but who are not held in immigration agency custody”

What kind of values could this variable take?

Variables: Example

number of non-citizens with criminal convictions facing deportation proceedings but who are not held in immigration agency custody:

integer
must be greater than or equal to \(0\)

Can you think of different variables (for this claim)?

what values could they take?

e.g. “convicted murderers per capita among undocumented migrants”

Variables: Example

“The immigration status of a person.”

This is a variable: what values does it take?

Categories (e.g., natural born citizen, naturalized citizen, visa holder, permanent resident, undocumented, etc.)
No numeric range
One immigration status is not “higher”/“lower” than another.

Variables

The kind of values a variable takes is called its level of measurement

Four levels of measurement

Nominal
Ordinal
Interval
Ratio

Not to be confused with measures

Levels of measurement: nominal

nominal levels of measurement:

place cases into unranked categories
discrete groups based on presence/absence of attribute(s)
no category is “more” or “less” than another
categories are exhaustive (every case can fit in a category)
- sometimes we just have “other”

Examples:

Religion
Partisan affiliation
Regime type (e.g. minimalist democracy vs non-democracy)
Type of crime (e.g. murder vs. assault vs. burglary etc.)

Levels of measurement: ordinal

ordinal levels of measurement

place cases into categories that are ranked
may have a number attached (or not)
Cases can be said to have more or less of something
Intervals between categories not meaningful
relative levels, not absolute levels

Examples:

University rankings
Test score percentiles
Ideology (very liberal, somewhat liberal, neither, somewhat conservative, very conservative)
Level of democracy

Levels of measurement: interval

interval levels of measurement

assign cases numbers that rank the cases
have intervals between values that are meaningful and consistent (1 unit change is the same size each time)
difference in values indicates how much more or less of something one case has than another
no meaningful zero point, so ratios not meaningful

Examples

Year (but not years since some event)
Temperature (in Celsius, but not Kelvin)
Date of first COVID-19 vaccination
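
A small worked example of why ratios are not meaningful at the interval level, using temperature. The two readings are arbitrary values chosen for illustration.

```python
KELVIN_OFFSET = 273.15


def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to Kelvin (same unit size, true zero)."""
    return celsius + KELVIN_OFFSET


cool_c, warm_c = 10.0, 20.0
cool_k, warm_k = celsius_to_kelvin(cool_c), celsius_to_kelvin(warm_c)

# Differences survive the change of scale: intervals are meaningful.
difference_c = warm_c - cool_c
difference_k = warm_k - cool_k

# Ratios do not: "twice as hot" holds only on the arbitrary Celsius zero.
ratio_c = warm_c / cool_c
ratio_k = warm_k / cool_k

print(f"difference: {difference_c:.2f} C vs {difference_k:.2f} K")
print(f"ratio: {ratio_c:.2f} (Celsius) vs {ratio_k:.2f} (Kelvin)")
```

The difference is the same on both scales, but the ratio changes with the choice of zero, which is why Celsius is interval and Kelvin is ratio.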

Levels of measurement: ratio

ratio levels of measurement

assign cases numbers that rank the cases
have intervals that are meaningful and consistent
difference in values indicates how much more or less
zero indicates absence
ratios meaningful (something can be twice as much as something else)

Examples

Time since some event
Counts of events
Rates (unemployment, language spoken, political party preference)
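
One way to summarize the four levels is by the comparisons each one supports. The sketch below is our own summary, with hypothetical names, and assumes the levels are cumulative (each level supports everything the previous one does).

```python
from enum import IntEnum


class Level(IntEnum):
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Lowest level of measurement at which each comparison becomes meaningful
MIN_LEVEL = {
    "same_or_different": Level.NOMINAL,
    "more_or_less": Level.ORDINAL,
    "how_much_more": Level.INTERVAL,
    "how_many_times_more": Level.RATIO,
}


def is_meaningful(level, comparison):
    """Return True if the comparison makes sense at this level."""
    return level >= MIN_LEVEL[comparison]


for level in Level:
    supported = [c for c in MIN_LEVEL if is_meaningful(level, c)]
    print(level.name, supported)
```

For example, `is_meaningful(Level.ORDINAL, "how_much_more")` is false: a ranking tells us which case has more, not how much more.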

Example: Gun Violence

What is the level of measurement?

Cause of death
Number of gun deaths
Change over time in number of gun deaths
Proportion of all murders that involve guns
Strictness of gun ownership regulations
Year in which a country bans assault weapons

Discuss with your neighbors

Example: Gun Violence

What is the level of measurement?

Cause of death (nominal)
Number of gun deaths (ratio)
Change over time in number of gun deaths (ratio)
Proportion of all murders that involve guns (ratio)
Strictness of gun ownership regulations (ordinal)
Year in which a country bans assault weapons (interval)

Types of Variables

Variables also give values in absolute and relative units.

absolute values are counts given in raw units

Examples: dollar amounts, number of events, number of deaths

relative values are given in fractions or rates, ranks, percentage change

Examples:

Units are fractional (GDP per capita, deaths/population, events/time)
No units (ordinal rankings)
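
A minimal sketch of how absolute and relative versions of a variable can point in opposite directions. The two places and all of the numbers are made up for illustration.

```python
def rate_per_100k(count, population):
    """Relative value: events per 100,000 people."""
    return count * 100_000 / population


# Hypothetical places with made-up numbers (not real data)
small_place = {"deaths": 800, "population": 4_000_000}
large_place = {"deaths": 3_000, "population": 40_000_000}

small_rate = rate_per_100k(small_place["deaths"], small_place["population"])
large_rate = rate_per_100k(large_place["deaths"], large_place["population"])

# Absolute comparison: the large place has more deaths.
print(large_place["deaths"] > small_place["deaths"])   # True
# Relative comparison: the small place has the higher rate.
print(small_rate, large_rate)                          # 20.0 7.5
```

Which version is appropriate depends on the claim: "more deaths" calls for the absolute count, while "more dangerous for a typical resident" calls for the rate.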

descriptive claims:

claims about what exists (or has existed/will exist) in the world:

what kinds of things exist? (nominal)
what is the type of this case? (nominal)
how much of something is there? (ordinal, interval, ratio)
how much of something is there here vs. there/now vs. then? (ordinal, interval, ratio)
what patterns are there in the co-occurrence of different phenomena? (depends)

Types of Variables

Suppose we want to know whether this is true: Canada has less gun violence than the United States

Which variable would be best?

Number of gun deaths
Number of murders per capita
Number of murders using guns per capita

Variables: Example

A claim: “13,099 convicted murderers have crossed the border and are free to roam and kill in [the United States]”

Concept(s): undocumented immigrants, convicted murderers, legal detention

Variable: number of non-citizens with criminal convictions facing deportation proceedings but who are not held in immigration agency custody

What could go wrong here?

non-citizens include legal migrants
ignores detention by other agencies
ignores the relative propensity of citizens and non-citizens to commit murder

Validity

Variables can fail:

Even if we develop a useful concept…

variables may not correspond to the concept

It doesn’t mean the claim is wrong, but it means that evidence is potentially irrelevant.

We have to develop variables that better match the concepts in the claim.

Variable Trouble: Validity

validity:

Degree of fit between a variable and the concept the variable is intended to capture.
When a variable “captures” or “maps onto” the concept we are interested in, then we say it has “validity”
When a variable “captures” or “maps onto” other concepts we are not interested in, then we say it lacks “validity”
How to know we are talking about validity problems: even if we are able to perfectly observe (measure) something, it doesn’t speak to the claim.

Variable Trouble: Validity

“Which country/province is most politically corrupt?”

Concept: Political Corruption or “the use of power by government officials for illegitimate private gain”

Variable: Fraction of political officeholders in a place prosecuted for corruption

Measure: Match criminal court defendants in corruption prosecutions to list of politicians.

Does this variable have validity? (Does it correspond only to the intended concept?)

Problems

Places with lots of corruption do not prosecute corruption
Places with low corruption successfully prosecute corruption
Places have different corruption laws.
But the measure may give correct values for the variable
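
The last point can be made concrete with a sketch of the measure as simple record matching. The function name and all of the records are hypothetical, and matching on exact names is an assumption made to keep the example short.

```python
def fraction_prosecuted(officeholders, defendants):
    """Fraction of officeholders who appear as corruption defendants."""
    officeholders = set(officeholders)
    matched = officeholders & set(defendants)
    return len(matched) / len(officeholders)


# Hypothetical records for one place (not real people or cases)
officeholders = ["Official A", "Official B", "Official C", "Official D"]
defendants = ["Official B", "Private Citizen X"]

print(fraction_prosecuted(officeholders, defendants))   # 0.25
```

The procedure returns the correct value of the variable (one of four officeholders was prosecuted), yet that value says little about how much corruption exists, because prosecution depends on the laws and on the willingness to enforce them. The problem is with the variable, not the measure.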

Variable Trouble: Validity

Light et al. (2020) investigate the claim: “undocumented migrants are prone to violent crime”

concepts: undocumented, violent criminals

variable: arrest rates for violent crime (homicide, assault, robbery, sexual assault) for US-born citizens, legal immigrants, undocumented immigrants

measure: crimes listed in arrests in the Texas Computerized Criminal History database, immigration status as determined by DHS and ICE using biometrics database, estimates of undocumented migrants using Census data

Variable Trouble: Validity

arrests \(\neq\) convictions \(\neq\) actual crimes
many violent crimes go “unsolved”

Conclusion

Distinguish between Variables and Measures
Variables:
- levels of measurement
- absolute/relative
Validity
- what is it?
- why do variables not have it?