February 4, 2021

## Objectives

### (2) Variables

• Levels of Measurement
• Validity

## Evaluating Descriptive Claims

### Concept

• transparent definition; which observable traits make something an “X” $$\xrightarrow{}$$

### Variable(s)

• observable properties of cases that correspond to the concept $$\xrightarrow{}$$

### Measure(s)

• procedure to find the values variables take for specific cases $$\xrightarrow{}$$

### Question/Claim

$$\not\xrightarrow{}$$ Concepts not transparent/not well formulated

### Concept

$$\not\xrightarrow{}$$ Variable does not map onto concept

### Variables

$$\not\xrightarrow{}$$ Procedure does not return the true value

### Measure(s)

$$\not\xrightarrow{}$$

## Variables and Measures:

### variable(s):

A measurable property of a case that corresponds to a concept or part of a concept and can potentially take on different values across cases and time (it varies across cases).

• Chosen to indicate (degree of) membership in a concept
• Variables take on values for each case at a specific point in time
• Variation across cases or over time.
• General (e.g., “number of COVID-19 deaths”, not “number of COVID-19 deaths in BC since March 2020”)

## Variables and Measures:

### measure(s)

A procedure for determining the value a variable takes for specific cases based on observation.

• Measures are proposed to determine the value a variable takes for some cases
• They are always for some specific cases we want to know about (e.g., a procedure for estimating COVID-19 deaths in BC since March 2020)

## A Trivial Example:

### A descriptive question:

What is the tallest mountain on the North Shore?

### We need to:

• define the concept of “tall”/“height”
• create a variable that matches that definition and is observable
• develop a procedure to obtain values of that variable for mountains on the North Shore

## Concept to Measurement:

### Concept: Height (of a mountain)

Elevation of peak above sea level

### $$\xrightarrow{}$$ Variable:

Vertical distance in meters from the Burrard Inlet to the top of the peak

### $$\xrightarrow{}$$ Measure:

Use difference in barometric pressure to calculate difference in elevation
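As a rough sketch of what this measure could look like, the function below converts two pressure readings into a vertical distance using the isothermal form of the barometric (hypsometric) equation. The function name, the default air temperature, and the example pressure readings are assumptions made for illustration; they do not come from the lecture.

```python
import math

# Standard physical constants
R = 8.314462618   # universal gas constant, J/(mol*K)
G = 9.80665       # standard gravity, m/s^2
M = 0.0289644     # molar mass of dry air, kg/mol


def elevation_difference(p_base_hpa, p_peak_hpa, temp_kelvin=288.15):
    """Estimate vertical distance in meters between two pressure readings.

    Assumes one constant air temperature between the two points. This is a
    simplification: real surveys correct for temperature and humidity.
    """
    if p_base_hpa <= 0 or p_peak_hpa <= 0:
        raise ValueError("pressures must be positive")
    scale_height = (R * temp_kelvin) / (G * M)
    return scale_height * math.log(p_base_hpa / p_peak_hpa)


# Hypothetical readings: one at the inlet, one at the top of the peak
height_m = elevation_difference(1013.25, 850.0)
print(round(height_m), "meters (under the stated assumptions)")
```

The procedure can return a wrong value (for example, if the weather changes between readings) even when the variable itself matches the concept well.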

## Concept to Measurement:

Are you going to climb the mountain? Prominence might be a better concept of height.

## Concept to Measurement:

### Concept: Height (of a mountain)

Prominence of peak compared to nearby mountains

### $$\xrightarrow{}$$ Variable:

Vertical distance in meters from the top of the peak to the lowest contour line that surrounds it and no higher peak.

### $$\xrightarrow{}$$ Measure:

Satellites using radar interferometry create topographical maps; computer algorithm to find lowest contour

## Concept to Measurement:

Different concepts $$\to$$ different variables

Different variables $$\to$$ different measures

• Using elevation from sea level, West Lion is taller
• Using prominence, Seymour is taller.
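The point can be sketched in a few lines: the same cases, ranked on two different variables, give two different answers. The numbers below are illustrative placeholders chosen only to reproduce the ordering stated above; they are not survey data, and the `tallest` helper is a hypothetical name.

```python
# Placeholder values in meters, for illustration only (NOT survey data).
mountains = {
    "West Lion": {"elevation_m": 1650, "prominence_m": 370},
    "Seymour": {"elevation_m": 1450, "prominence_m": 450},
}


def tallest(cases, variable):
    """Return the name of the case with the largest value of `variable`."""
    return max(cases, key=lambda name: cases[name][variable])


print("By elevation:", tallest(mountains, "elevation_m"))
print("By prominence:", tallest(mountains, "prominence_m"))
```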

## Variables

variables:

• take on values for each specific case at a specific point in time.
• values vary between cases at the same point in time
• values vary within cases at different points in time
• take different kinds of values

## Variables: Example

A question: Was COVID-19 transmission higher after communities were the site of an in-person campaign rally by Donald Trump? see here

Concept: New COVID-19 infections

Variable: Number of new COVID-19 infections per week.

If we were to go look at a bunch of communities in the US that had Trump rallies in 2020, and try to measure “Number of new COVID-19 infections per week”…

What values could this variable take?

## Variables: Example

Number of new COVID-19 infections per week:

• integer
• must be greater than or equal to $$0$$

Different variables:

• New COVID-19 infections Per Capita, per week
• $$R_0$$: expected number of new cases generated by one case

## Variables: Example

“The first language learned by a person.” (mother tongue)

This is a variable: what values does it take?

• Categories
• No numeric range
• One language is not “higher”/“lower” than another.

## Variables

The kind of values a variable takes is called its level of measurement

• Nominal
• Ordinal
• Interval
• Ratio

## Levels of measurement: nominal

nominal levels of measurement:

• place cases into unranked categories
• discrete groups based on presence/absence of attribute(s)
• no category is “more” or “less” than another
• categories are exhaustive (every case can fit in a category)
• sometimes we just have “other”

### Examples:

• Religion
• Partisan affiliation
• Regime type (e.g. minimalist democracy vs non-democracy)
• Type of crime (e.g. hate vs. economic vs. personal etc.)

## Levels of measurement: ordinal

ordinal levels of measurement

• place cases into categories that are ranked
• may have a number attached (or not)
• Cases can be said to have more or less of something
• Intervals between categories not meaningful
• relative levels, not absolute levels

### Examples:

• University rankings
• Test score percentiles
• Ideology (very liberal, somewhat liberal, neither, somewhat conservative, very conservative)

## Levels of measurement: interval

interval levels of measurement

• assign cases numbers that rank the cases
• have intervals between values that are meaningful and consistent (1 unit change is the same size each time)
• difference in values indicates how much more or less of something one case has than another
• no meaningful zero point, ratios not meaningful

### Examples

• Years (but not years since some event)
• Temperature (in Celsius, but not Kelvin)
• Date of first COVID-19 vaccination
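A short worked example may help show why ratios are not meaningful on an interval scale. It compares two temperatures in Celsius and in Kelvin; the two example temperatures are arbitrary choices for illustration.

```python
import math


def celsius_to_kelvin(c):
    """Convert a Celsius temperature to Kelvin."""
    return c + 273.15


a_c, b_c = 10.0, 20.0  # arbitrary example temperatures in Celsius
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences agree across the two scales: intervals are meaningful.
diff_c = b_c - a_c
diff_k = b_k - a_k
print("same difference:", math.isclose(diff_c, diff_k))

# Ratios disagree: 20 C is not "twice as hot" as 10 C.
ratio_c = b_c / a_c
ratio_k = b_k / a_k
print("ratio in Celsius:", ratio_c)
print("ratio in Kelvin:", round(ratio_k, 3))
```

The Celsius ratio depends on where zero happens to sit, which is why Celsius is treated as interval and Kelvin as ratio.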

## Levels of measurement: ratio

ratio levels of measurement

• assign cases numbers that rank the cases
• have intervals that are meaningful and consistent
• difference in values indicates how much more or less
• zero indicates absence
• ratios meaningful (something can be twice as much as something else)

### Examples

• Years since some event
• Counts of events
• Rates (unemployment, language spoken, political party preference)
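One way to summarize the four levels is by the comparisons each one supports, with every level keeping the comparisons of the levels before it. The sketch below encodes that idea; the names `Level`, `MIN_LEVEL`, `allowed`, and `meaningful_comparisons` are hypothetical and exist only for this illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Lowest level at which each kind of comparison becomes meaningful
MIN_LEVEL = {
    "same_or_different": Level.NOMINAL,
    "more_or_less": Level.ORDINAL,
    "size_of_difference": Level.INTERVAL,
    "ratio": Level.RATIO,
}


def allowed(level, comparison):
    """True if `comparison` is meaningful for a variable at `level`."""
    return level >= MIN_LEVEL[comparison]


def meaningful_comparisons(level):
    """List every comparison that is meaningful at `level`."""
    return [name for name in MIN_LEVEL if allowed(level, name)]


for level in Level:
    print(level.name, meaningful_comparisons(level))
```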

## Example: Gun Violence

What is the level of measurement?

1. Cause of death
2. Number of gun deaths
3. Change over time in number of gun deaths
4. Proportion of all murders that involve guns
5. Strictness of gun ownership regulations
6. Year in which a country bans assault weapons

POLL

## Example: Gun Violence

What is the level of measurement?

1. Cause of death (nominal)
2. Number of gun deaths (ratio)
3. Change over time in number of gun deaths (ratio)
4. Proportion of all murders that involve guns (ratio)
5. Strictness of gun ownership regulations (ordinal)
6. Year in which a country bans assault weapons (interval)

## Types of Variables

Variables can also report values in absolute or relative units.

#### absolute values are counts given in raw units

Examples: dollar amounts, number of events, number of deaths

#### relative values are given in fractions or rates or ranks

Examples:

• Units are fractional (GDP per capita, deaths/population, events/time)
• No units (ordinal rankings)
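To make the distinction concrete, the sketch below derives one absolute variable (a raw count) and two relative variables (a rate per 100,000 people and a rank) from the same underlying data. The three states and all of their numbers are invented for illustration, and the helper names are hypothetical.

```python
# Hypothetical cases: counts and populations are made up for illustration.
cases = {
    "State A": {"deaths": 3000, "population": 30_000_000},
    "State B": {"deaths": 900, "population": 3_000_000},
    "State C": {"deaths": 1200, "population": 6_000_000},
}


def rate_per_100k(deaths, population):
    """Relative value: deaths per 100,000 people."""
    return deaths / population * 100_000


def ranks(values):
    """Rank cases from highest (1) to lowest value; ties are not handled."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


counts = {name: c["deaths"] for name, c in cases.items()}
rates = {
    name: rate_per_100k(c["deaths"], c["population"])
    for name, c in cases.items()
}

print("rank by count:", ranks(counts))
print("rank by rate: ", ranks(rates))
```

In this made-up data the state with the most deaths has the lowest rate, so the choice between an absolute and a relative variable changes the answer to "where is the problem worst?".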

## Types of Variables

### Which variable would be best?

1. Number of gun deaths in the preceding year
2. Rank of the state on a list of states ordered by gun deaths in the preceding year
3. Gun deaths per 100,000 people in the preceding year

POLL

### descriptive claims:

claims about what exists (or has existed/will exist) in the world:

• what kinds of things exist? (nominal)
• what is the type of this case? (nominal)
• how much of something is there? (ordinal, interval, absolute ratio)
• how much of something is there here vs. there/now vs. then? (ordinal, interval, relative ratio)
• what patterns are there in the shared appearance/non-appearance of different phenomena? (depends)

## Variables can fail:

Even if we develop a useful concept…

## Variable Trouble: Validity

#### validity:

• Degree of fit between a variable and the concept the variable is intended to capture.
• When a variable “captures” or “maps onto” the concept we are interested in, we say it has “validity”
• When a variable “captures” or “maps onto” other concepts we are not interested in, we say it lacks “validity”

## Variable Trouble: Validity

“Which country/province is most politically corrupt?”

Concept: Political Corruption or “the use of power by government officials for illegitimate private gain”

Variable: Fraction of politicians in a place prosecuted for corruption

Measure: Match criminal court defendants in corruption prosecutions to list of politicians.

### Does this variable have validity?

### Problems

• Places with lots of corruption may not prosecute corruption
• Places with little corruption may prosecute it successfully
• Places have different corruption laws.
• The measure may still return correct values for the variable: the problem here is validity, not the measurement procedure
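A toy calculation can illustrate the first two problems. It assumes that the fraction of politicians prosecuted equals the share who are corrupt multiplied by the chance that a corrupt politician is prosecuted. Both places and all numbers are invented for illustration; the function names are hypothetical.

```python
# Hypothetical places: every number is invented to illustrate the problem.
places = {
    "Place A": {"politicians": 200, "corrupt": 100, "prosecution_rate": 0.02},
    "Place B": {"politicians": 200, "corrupt": 10, "prosecution_rate": 0.80},
}


def true_corruption(place):
    """Share of politicians who are corrupt (the concept, unobserved)."""
    return place["corrupt"] / place["politicians"]


def prosecuted_fraction(place):
    """Share of politicians prosecuted for corruption (the variable)."""
    prosecuted = place["corrupt"] * place["prosecution_rate"]
    return prosecuted / place["politicians"]


for name, place in places.items():
    print(name, true_corruption(place), prosecuted_fraction(place))
```

In this sketch the variable ranks Place B above Place A even though Place A has far more corruption, because the variable also captures how willing and able a place is to prosecute.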

## Variable Trouble: Validity

Claim: “There was extensive election fraud in the 2020 US Presidential Election”

Concept: election fraud: manipulations of votes or vote counting by changing the candidate or counting ineligible ballots

Variable: Abnormally high voter turnout in constituencies with fraud allegations (would result from higher rates of filling out absentee ballots for people who hadn’t voted, dead people voting, ineligible people voting, or even payments to legally registered people for their votes)

Measure: Statistically significant difference in turnout (Number of votes cast divided by registered voters according to official election results in US 2020 Presidential election) between counties with fraud accusations vs. others. see here
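The measure above could be sketched as follows. The lecture does not say which statistical test is used, so this example swaps in a simple two-sided permutation test as one possible choice. The county figures are hypothetical placeholders, not election results, and the function names are assumptions.

```python
import random
from statistics import mean


def turnout(votes_cast, registered_voters):
    """Turnout as defined above: votes cast divided by registered voters."""
    return votes_cast / registered_voters


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in mean turnout."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Hypothetical (votes cast, registered voters) pairs, one per county
accused = [(6600, 10000), (7200, 10000), (6400, 10000), (6900, 10000)]
others = [(6700, 10000), (6500, 10000), (7000, 10000), (6800, 10000)]

p_value = permutation_p_value(
    [turnout(*county) for county in accused],
    [turnout(*county) for county in others],
)
print("p-value:", p_value)
```

Even a correct p-value only speaks to the variable. Whether higher turnout actually maps onto fraud, rather than onto other concepts such as voter mobilization, is the validity question.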

## Objectives

1. Distinguish between Variables and Measures
2. Variables:
   • levels of measurement
   • absolute/relative
3. Validity
   • what is it?
   • why do variables not have it?