LEVELS OF MEASUREMENT

Introduction

When data are prepared for analysis by computer, values of variables are usually entered as numbers. Sometimes such coding is natural — for example, the population of a country or the number of votes received by a candidate. Sometimes, artificial numerical codes are created for convenience of processing. In a file containing data on the voting preferences of Canadians, for example, Liberals might be coded numerically as 1, Conservatives as 2, and New Democrats as 3. Numerical, however, is not the same thing as quantitative. In fact, whether data are coded numerically or not, there are different "levels of measurement."

Nominal Data

The values of a nominal variable do not indicate the amount of the thing being measured, nor are they in any particular order. If coded numerically, the numbers chosen are arbitrary. For example, if we list the regions of Canada as Atlantic, Quebec, Ontario, Prairie, and B.C., we are not indicating the amount of "regionness" each possesses, nor listing them in order of "regionness." We may code the regions as "1," "2," "3," "4," and "5" respectively, but this is done merely for convenience and in no way quantifies what we are doing. Each value, numerical or otherwise, is merely a label or name (hence the term "nominal").
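The arbitrariness of nominal codes can be illustrated with a minimal sketch. The respondents and the alternative coding scheme below are invented for illustration; only the five region labels come from the text. Counting how often each label occurs gives the same answer under either coding, while the "average region" changes with the coding and has no interpretation.

```python
from collections import Counter

# Region codes as in the text: 1 = Atlantic ... 5 = B.C.
LABELS = {1: "Atlantic", 2: "Quebec", 3: "Ontario", 4: "Prairie", 5: "B.C."}

# Hypothetical data: region codes for six made-up respondents.
codes = [3, 3, 2, 5, 1, 3]

# A second, equally arbitrary coding of the same five regions (an assumption).
RECODE = {1: 50, 2: 10, 3: 40, 4: 20, 5: 30}
recoded = [RECODE[c] for c in codes]
RELABELS = {RECODE[k]: name for k, name in LABELS.items()}


def label_counts(values, labels):
    """Count how often each region label occurs, whatever numbers encode it."""
    return Counter(labels[v] for v in values)


counts_a = label_counts(codes, LABELS)
counts_b = label_counts(recoded, RELABELS)
same_counts = counts_a == counts_b  # frequencies do not depend on the coding

# The arithmetic mean of the codes does depend on the coding, so it is meaningless.
mean_a = sum(codes) / len(codes)
mean_b = sum(recoded) / len(recoded)

print(counts_a)
print(same_counts, mean_a, mean_b)
```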

Ordinal Data

Sometimes the values of a variable are listed in order. (Alternatively, we say that the values are "ranked" or "rank ordered.") For example, the Canadian Services orders (ranks) its military personnel from General/Admiral to Private/Seaman. And in questionnaires we sometimes ask respondents whether something such as gender equality is "very important," "somewhat important," or "not important at all" to them. In both of these examples, the values of the variable in question (military rank or respondent attitudes) are ranked from highest to lowest, or vice versa. There are other kinds of ordering. For example, respondents in a survey may be asked to identify their political ideology as "very liberal," "liberal," "moderate," "conservative," or "very conservative," creating a scale rank ordered from most liberal to most conservative.
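As a rough sketch of what ordering makes possible, the following Python fragment sorts a handful of answers by rank and picks the middle one. The five answers are invented for illustration; only the three response categories come from the text. Finding a middle case requires an order, which nominal data lack.

```python
# Response categories from the text, listed from lowest to highest importance.
SCALE = ["not important at all", "somewhat important", "very important"]
RANK = {label: position for position, label in enumerate(SCALE)}

# Hypothetical data: answers from five made-up respondents.
answers = [
    "very important",
    "somewhat important",
    "very important",
    "not important at all",
    "somewhat important",
]

# Sorting relies only on the order of the categories, not on distances between them.
ordered = sorted(answers, key=RANK.__getitem__)

# With an odd number of cases, the median is simply the middle case.
median_answer = ordered[len(ordered) // 2]

print(ordered)
print(median_answer)
```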

Interval Data

Sometimes, in addition to being ordered, the differences (or intervals) between any two adjacent values on a measurement scale are the same. For example, the difference in temperature between 80 degrees Fahrenheit and 81 degrees is the same as that between 90 degrees and 91 degrees. When each interval represents the same increment of the thing being measured, the measure is called an interval measure.
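The equal-interval property can be checked with a short sketch. It uses the standard Fahrenheit-to-Celsius conversion, which the text does not mention, to suggest that equal intervals on one temperature scale remain equal on the other. The function name is ours.

```python
import math


def fahrenheit_to_celsius(f):
    """Standard conversion: C = (F - 32) * 5/9."""
    return (f - 32) * 5 / 9


# One-degree Fahrenheit steps at two different points on the scale.
diff_low = fahrenheit_to_celsius(81) - fahrenheit_to_celsius(80)
diff_high = fahrenheit_to_celsius(91) - fahrenheit_to_celsius(90)

# Both steps correspond to the same Celsius increment (5/9 of a degree),
# up to floating-point rounding.
equal_steps = math.isclose(diff_low, diff_high)

print(diff_low, diff_high, equal_steps)
```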

Ratio Data

Finally, in addition to having equal intervals, some measures also have an absolute zero point. That is, zero represents the absence of the thing being measured. Height and weight are obvious examples. Physicists sometimes use the Kelvin temperature scale, in which zero means the complete absence of thermal energy. The same is not true of the Fahrenheit or Celsius (Centigrade) scales. Zero degrees Centigrade, for example, represents the freezing point of water at sea level. But this does not mean that there is no temperature at this point. The choice to put zero degrees at this point on the scale is arbitrary. There is no particular reason why scientists could not have chosen instead the freezing point of beer in Golden, Colorado (other than that water is a more common substance, at least for most successful scientists). With an absolute zero point, you can calculate ratios (hence the name). For example, $20 is twice as much as $10, but 60 degrees Fahrenheit is not really twice as hot as 30 degrees. Ratio data are fully quantitative: they tell us the amount of the variable being measured. The percentage of votes received by a candidate, Gross Domestic Product per Capita, and the crime rate are all ratio variables.
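A small sketch may make the point about ratios concrete. It converts the two Fahrenheit temperatures from the text to Kelvin using the standard formula (our addition, not the text's) and compares the ratios. On the Kelvin scale, which has an absolute zero, 60 degrees Fahrenheit turns out to be only about 6 percent "hotter" than 30 degrees. Dollar amounts, by contrast, keep the same ratio when re-expressed in cents.

```python
def fahrenheit_to_kelvin(f):
    """Standard conversion: K = (F - 32) * 5/9 + 273.15."""
    return (f - 32) * 5 / 9 + 273.15


# Dividing the raw Fahrenheit readings suggests "twice as hot" ...
naive_ratio = 60 / 30

# ... but on a scale with an absolute zero the ratio is close to 1.
kelvin_ratio = fahrenheit_to_kelvin(60) / fahrenheit_to_kelvin(30)

# Money has a true zero, so the ratio survives a change of units.
dollar_ratio = 20 / 10
cent_ratio = (20 * 100) / (10 * 100)

print(naive_ratio, round(kelvin_ratio, 3))
print(dollar_ratio, cent_ratio)
```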

Dichotomous Variables

Dichotomous variables (those with only two values) are a special case, and may sometimes be treated as nominal, ordinal, or interval. Take, for example, gender. Gender is, on its face, a pure example of a nominal variable, with the values of the variable being simply the labels of female and male (or arbitrary numbers used, for convenience, in place of the labels). On the other hand, we could treat gender (and other dichotomous variables) as ordinal, since there are only two possible ways for the values to be ordered, and it makes no difference which way is chosen. There is, therefore, no way that they can be listed out of order. Whether it is better to treat a dichotomous variable as nominal or ordinal depends on how we think of the underlying concept being measured. Do our values indicate two sharply distinct categories (e.g., female and male), or do we think of them as two points along a spectrum (e.g., degree of femininity and masculinity)?

For certain purposes, we can even treat dichotomous variables as interval, since there is only one interval (the difference between female and male or Group A and Group B), which is obviously equal to itself[1].
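One practical consequence of treating a dichotomy as interval can be sketched as follows. If the two groups are coded 0 and 1 (a coding we assume here, with made-up cases), the arithmetic mean of the codes equals the proportion of cases coded 1, so an operation that is meaningless for most nominal variables has a clear interpretation.

```python
# Hypothetical data: eight made-up cases, coded 1 = Group B, 0 = Group A.
group_b = [1, 0, 0, 1, 1, 0, 1, 1]

# The mean of a 0/1 variable ...
mean_code = sum(group_b) / len(group_b)

# ... is the same number as the share of cases in the group coded 1.
share_b = group_b.count(1) / len(group_b)

print(mean_code, share_b)
```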

Why It Matters

Level of measurement is important because the higher the level of measurement of a variable (note that "level of measurement" is itself an ordinal measure) the more powerful are the statistical techniques that can be used to analyze it. With nominal data, you can count the frequency with which each value of a variable occurs. A voter's choice in the 2004 Canadian Federal Election, for example, is a nominal variable (with the values of the variable being "Martin," "Harper," and "Layton"), and so you can count the number of votes received by each candidate. You can also calculate the percentage of votes each candidate received. You can calculate joint frequencies and percentages (how many and what percent of votes were cast in Ontario for Harper, for example). You can also use certain measures that tell you how strong the relationship is between region and vote, and the likelihood that the relationship occurred by chance. On the other hand, there are other operations you cannot legitimately perform with nominal data. Even if you use numbers to label candidates (e.g., 1 = Martin, 2 = Harper, 3 = Layton), you cannot very well say that Martin plus Harper equals Layton, or that Layton divided by Harper is halfway between Martin and Harper. Unfortunately, there are many statistical techniques that require higher levels of measurement.
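The operations that are legitimate for nominal data (frequencies, percentages, and joint frequencies) can be sketched in a few lines of Python. The eight ballots below are invented for illustration and are not real election results. Only the candidate and region labels come from the text.

```python
from collections import Counter

# Hypothetical data: (region, vote) pairs for eight made-up voters.
ballots = [
    ("Ontario", "Martin"),
    ("Ontario", "Harper"),
    ("Ontario", "Martin"),
    ("Prairie", "Harper"),
    ("Prairie", "Harper"),
    ("Quebec", "Martin"),
    ("B.C.", "Layton"),
    ("Ontario", "Layton"),
]
total = len(ballots)

# Frequency of each value of the vote variable.
vote_counts = Counter(vote for _, vote in ballots)

# Percentage of all votes received by each candidate.
vote_pct = {name: 100 * n / total for name, n in vote_counts.items()}

# Joint frequencies: how many votes fall in each (region, vote) combination.
joint_counts = Counter(ballots)
ontario_harper = joint_counts[("Ontario", "Harper")]
ontario_harper_pct = 100 * ontario_harper / total

print(vote_counts)
print(vote_pct)
print(ontario_harper, ontario_harper_pct)
```

Nothing here adds, subtracts, or averages the labels themselves. The only arithmetic is performed on counts of cases.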

With ordinal data, you can employ techniques that take into account the fact that the values of a variable are listed in a meaningful order. With interval data, you can go even further and use powerful techniques that assume a measurement scale of equal intervals. As it happens, there are very few techniques in the social sciences that require ratio data, and so some textbooks ignore the distinction between interval and ratio scales.

If you use a technique that assumes a higher level of measurement than is appropriate for your data, you risk getting a meaningless answer. On the other hand, if you use a technique that fails to take advantage of a higher level of measurement, you may overlook important things about your data. (Note: in addition to level of measurement, many statistical techniques also require other assumptions about your data. For example, even if a variable is interval, some otherwise appropriate techniques may yield misleading results if the variable includes some values that are extremely high or low relative to the rest of the distribution.)
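The caution about extreme values can be illustrated with a minimal sketch using Python's standard statistics module. The figures are invented (think of them as incomes in thousands of dollars). A single very high value pulls the mean far from the typical case, while the median barely moves.

```python
from statistics import mean, median

# Hypothetical data: five made-up values of an interval-level variable.
incomes = [30, 35, 40, 45, 50]

# The same data with one extremely high value added.
with_outlier = incomes + [1000]

mean_before, median_before = mean(incomes), median(incomes)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(mean_before, median_before)
print(mean_after, median_after)
```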

The distinctions between levels of measurement are not always hard and fast. Often it depends on the underlying concept being measured. In survey research, for example, those who answer "don't know" or "no opinion" are sometimes thought of as being somewhere in between the substantive preferences of agree and disagree, and so measures of attitudes such as support for the Kyoto Accord or same-sex marriage are often treated as ordinal. Sometimes the question of level of measurement hinges on the precise way in which the measure was taken. For example, the election studies have for many years been using "feeling thermometers." Respondents are asked to locate a person (e.g., a party leader) or a category of people (Liberals, New Democrats, feminists, Francophones, etc.) on a scale ranging from 0 to 100, with higher numbers representing warmer feelings toward the person or category of people in question. Most researchers using these variables have treated them as interval. Some, however, have raised doubts about this practice. For example, does the difference between a rating of, say, 60 and 70 really mean the same thing as the difference between 90 and 100?

In designing research, there can also be tradeoffs between having data that are at a higher level of measurement and other considerations. Aggregate data (data about groups of people) are generally interval or ratio, but usually provide only indirect measures of how people think and act. Individual data get at these things more directly, but are usually only nominal or ordinal. Official election returns, for example, can provide us with ratio level data about the distribution of votes in each precinct. These data, however, tell us little about why individual people vote the way they do. Survey research (public opinion polling), which provides data that are for the most part only nominal or ordinal, allows us to explore such questions much more extensively and directly.

Other Ways to Classify Data

Sometimes you will find other terms used to describe the level of measurement of variables. SPSS, for example, distinguishes among nominal, ordinal, and scale (that is, interval or ratio) variables. Some texts distinguish between nonparametric (nominal or ordinal) and parametric (interval or ratio) variables. In describing different statistical procedures, we will sometimes distinguish between categorical and continuous variables. Categorical variables generally consist of a small number of values, or categories, and are usually nominal or ordinal. The values of continuous variables represent a large or even infinite number of possible points along a scale, and are interval or ratio.

Exercises

1. Open the Canadian Election Study in Webstats and look through its variables. Notice that almost all of the variables are either nominal or ordinal. With survey data, this is typical. Now open “countries.sav” and do the same. Notice that almost all of the variables are scale, as is usually the case with aggregate data.

2. Open a codebook for another dataset included on Webstats. Classify each variable as nominal, ordinal, interval, or ratio. Note that in some cases, there may be more than one correct answer, depending on what assumptions are made.

For Further Study

Clark, Candace. “Levels of Measurement,” Sociology 240, Social Statistics. http://www.chss.montclair.edu/sociology/statbooklevels.htm.

Lane, David, et al. “Levels of Measurement,” Online Statistics: A Multimedia Course of Study. http://psych.rice.edu/online_stat/chapter1/levels_of_measurement.html.

[1] See the section on "dummy" variables under the regression analysis topic.