Date Assigned: October 4/6
Title of Assignment: Data Description
Due Date: October 11/13
Points: 10
Why is univariate analysis useful?
Computing basic frequency distributions is sometimes valuable in itself
for descriptive research. However, in explanatory research, which is our
focus in POL242, basic data description is primarily used to get an idea of
the measurement and variation of the variables to be analyzed. In future
assignments, you should always begin by looking at a frequency
distribution of the variables you intend to use. Moreover, you
need to be sure of the level at which your variables are measured, for this
determines which statistical techniques you can use. You also need to check whether
there is enough variation in the responses to the questions to warrant
including the variables in further explanatory research. If there is
little variation in the variables, for example if a very large majority
of respondents give the same answer or the distribution is sharply
skewed, it is difficult to discern whether differences on one variable
are related to differences on another variable. In addition, closely
looking at frequency distributions can help you decide which categories
to treat as missing and which categories of a variable you may want to
recode or combine (collapse) before proceeding. These decisions may be
made on the basis of either theoretical or practical considerations, or
both.
What do I need to do for this assignment?
A. Browse through the questionnaires listed under the codebook section of
the research methods website.
B. Choose four questions, two that you think may be used in an analysis as
indicators for dependent variables, and two that may be used as indicators
for independent variables. Select interesting dependent variables,
i.e., ones likely to be of use throughout the term and perhaps in later
crafting an interesting presentation (no more than one of your dependent
variables can come from the in-class lab exercises).
1. Describe one dependent variable in your own words, and list the
available indicator's name (like CPS_B3_3 or Q380).
2. In which dataset is this variable's indicator found?
3. What do you hypothesize causes or influences variation in this dependent variable?
4. Which independent variable(s) do you hypothesize will influence the
dependent variable in Question 1? Once again, describe the variable
in your own words and list the indicator's name. The indicators for
your independent variables must be drawn from the same dataset as the
indicator for your dependent variable.
5. Describe a second dependent variable using your own words
as well as by its name.
6. In which dataset is this variable found?
7. What do you hypothesize causes or influences variation in this second dependent variable?
8. Which independent variable(s) do you hypothesize will
influence variation in your second dependent variable? Once again,
describe the variables in your own words and list their identifying
names in the dataset. Again, the indicators for the dependent and
independent variables must be from the same dataset.
C. Compute frequency distributions from the appropriate datasets
for each of the four variables you have selected. You may need to make
several computer runs to look at preliminary frequency distributions to
decide which values should be declared as missing and what recodes to make
(either combining categories or changing the direction of the values).
If you do not remove missing
values and make the necessary recodes, the values of your descriptive
statistics will not be accurate. While it is possible at this point to
select variables that require no recodes or missing-value declarations, it
is advisable to learn how to make them now, as you will ultimately have to
do so for your term report.
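The preliminary-run workflow described in step C can be sketched as follows. This is only an illustration under stated assumptions: the handout does not say which software to use, so the sketch uses plain Python, and the responses, the missing-value codes (8 and 9), and the three-category recode are all hypothetical rather than taken from any course dataset.

```python
from collections import Counter

# Hypothetical responses to a five-point agree/disagree item.
# Codes 8 ("don't know") and 9 ("refused") are ASSUMED missing-value codes.
responses = [1, 2, 2, 3, 5, 4, 8, 2, 9, 1, 4, 5, 3, 2, 8]
MISSING_CODES = {8, 9}

# ASSUMED recode: collapse five categories into three
# (1-2 -> 1 "agree", 3 -> 2 "neutral", 4-5 -> 3 "disagree").
RECODE = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}


def frequency_table(values):
    """Return sorted (value, count, percent) rows for the given values."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(v, n, round(100 * n / total, 1)) for v, n in sorted(counts.items())]


# Preliminary run: look at every code, including the missing-value codes.
raw_table = frequency_table(responses)

# Final run: drop the missing codes, then apply the recode.
valid = [r for r in responses if r not in MISSING_CODES]
recoded = [RECODE[r] for r in valid]
recoded_table = frequency_table(recoded)

for label, table in (("raw", raw_table), ("recoded", recoded_table)):
    print(label)
    for value, count, percent in table:
        print(f"  {value}: n={count} ({percent}%)")
```

Comparing the raw and recoded tables shows how the percentages shift once the missing codes leave the base, which is why descriptive statistics computed on the raw codes would be misleading.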
For each of the variables, answer the following questions. We
recommend that you answer the questions on the same page as the final
distribution output and that you describe each variable on separate pages.
9. Describe any missing values you have declared and why you did so.
10. Describe any recodes you performed. Why did you make
these recodes?
11. What is the measurement level of the variable (nominal, ordinal, or
interval)? See Korey on Levels of Measurement for details.
12. What are the mean, mode, and median?
13. Which of these measures is most useful for this variable? Why?
14. Dispersion: How spread out are the values of this variable? What is
the variance? The standard deviation? In your own words, what do these
measures indicate for this variable?
15. What shape is the distribution?
a. Is the distribution skewed? Report the skewness measure and interpret
the results in your own words.
b. What is the kurtosis of the distribution? In your own words, what do
these results mean?
c. Does the distribution have any other interesting features (e.g., is it
bimodal)?
16. Based on these results, how useful do you think the variable will be
for future analyses? Is there enough variation in the indicator?
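The measures of central tendency, dispersion, and shape asked about in the questions above can be sketched as follows. The data are hypothetical valid responses, and the skewness and kurtosis formulas are the simple moment-based versions; statistical packages often apply small-sample corrections, so the values they report may differ slightly from these.

```python
import statistics

# Hypothetical valid responses (missing values already removed), coded 1-5.
values = [1, 2, 2, 3, 5, 4, 2, 1, 4, 5, 3, 2]

# Central tendency.
mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)

# Dispersion: sample variance (n - 1 denominator) and its square root.
variance = statistics.variance(values)
std_dev = statistics.stdev(values)


def central_moment(data, k, center):
    """Average k-th power of deviations from the center (n denominator)."""
    return sum((x - center) ** k for x in data) / len(data)


m2 = central_moment(values, 2, mean)
m3 = central_moment(values, 3, mean)
m4 = central_moment(values, 4, mean)

# Moment-based shape measures (no small-sample correction applied).
skewness = m3 / m2 ** 1.5          # > 0: longer tail toward high values
excess_kurtosis = m4 / m2 ** 2 - 3  # < 0: flatter than a normal curve

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"variance={variance:.2f} std_dev={std_dev:.2f}")
print(f"skewness={skewness:.2f} excess_kurtosis={excess_kurtosis:.2f}")
```

Remember that the mean, variance, skewness, and kurtosis are only meaningful for interval-level variables; for nominal variables the mode is the appropriate summary, and for ordinal variables the median and mode.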
