Assignment 1 2005-2006: Data Description

 

Date Assigned: October 4/6
Title of Assignment: Data Description
Due Date: October 11/13
Points: 10


Why is univariate analysis useful?

Computing basic frequency distributions is sometimes valuable in itself for descriptive research. However, in explanatory research of the kind we pursue in POL242, basic data description is primarily used to get an idea of the measurement and variation of the variables to be analyzed. In future assignments, you should always begin by looking at a frequency distribution of the variables you intend to use. Moreover, you need to be sure of the level at which your variables are measured, for this determines which statistical techniques you can use. You also need to check whether there is enough variation in the responses to warrant including the variables in further explanatory research. If there is little variation in a variable, for example if a very large majority of respondents give the same answer or the distribution is sharply skewed, it is difficult to discern whether differences on one variable are related to differences on another. In addition, looking closely at frequency distributions can help you decide which categories to treat as missing and which categories of a variable you may want to recode or combine (collapse) before proceeding. These decisions may be made on the basis of theoretical considerations, practical considerations, or both.

 

What do I need to do for this assignment?

 

A. Browse through the questionnaires listed under the codebook section of the research methods website.

B. Choose four questions: two that you think may be used in an analysis as indicators for dependent variables, and two that may be used as indicators for independent variables. Select interesting dependent variables, i.e., ones likely to be of use throughout the term and perhaps later in crafting an interesting presentation (no more than one of your dependent variables can come from the in-class lab exercises).

1. Describe one dependent variable in your own words, and list the available indicator's name (such as CPS_B3_3 or Q380).

 

2. In which dataset is this variable's indicator found?

3. What do you hypothesize causes or influences variation in this dependent variable?

 

4. Which independent variable(s) do you hypothesize will influence the dependent variable in Question 1? Once again, describe the variable in your own words and list the indicator's name. The indicators for your independent variable must be drawn from the same dataset as the indicator for your dependent variable.

 

5. Describe a second dependent variable in your own words, and list its indicator's name.

 

6. In which dataset is this variable found?

7. What do you hypothesize causes or influences variation in this second dependent variable?

 

8. Which independent variable(s) do you hypothesize will influence variation in your second dependent variable? Once again, describe the variables in your own words and give their identifying names in the dataset. Again, the indicators for the independent and dependent variables must be from the same dataset.

 

C. Compute frequency distributions from the appropriate datasets for each of the four variables you have selected. You may need to make several computer runs to look at preliminary frequency distributions before deciding which values should be declared as missing and what recodes to make (either combining categories or changing the direction of the values). If you do not remove missing values and make the necessary recodes, the values of your descriptive statistics will not be accurate. While it is possible at this point to select variables that require no missing-value declarations or recodes, it is advisable to learn how to do both now, as you will ultimately have to do so for your term report.
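The missing-value and recode steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's required software or procedure: the response values are invented, the codes 8 and 9 are assumed to stand for "don't know" and "refused", and the names clean and frequency_table are hypothetical helpers. Always check the codebook for the actual codes of your variable.

```python
from collections import Counter

# Invented responses to a 5-point agreement item (not real survey data).
responses = [1, 2, 2, 3, 5, 8, 4, 2, 9, 1, 5, 4, 2, 3, 8]

# Assumed missing-value codes: 8 = "don't know", 9 = "refused".
MISSING_CODES = {8, 9}

# Recode that reverses the direction of the scale (1 becomes 5, and so on).
REVERSE = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}


def clean(values, missing=MISSING_CODES, recode=REVERSE):
    """Drop values declared missing, then apply the recode map to the rest."""
    return [recode[v] for v in values if v not in missing]


def frequency_table(values):
    """Return (value, count, percent of valid cases) rows sorted by value."""
    counts = Counter(values)
    n = len(values)
    return [(v, c, 100.0 * c / n) for v, c in sorted(counts.items())]


valid = clean(responses)
table = frequency_table(valid)

for value, count, pct in table:
    print(f"{value:>5} {count:>5} {pct:6.1f}%")
print(f"valid N = {len(valid)}, missing = {len(responses) - len(valid)}")
```

Running the frequency table once on the raw responses and once on the cleaned values shows how much the percentages shift when missing codes are left in.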

For each of the variables, answer the following questions. We recommend that you answer the questions on the same page as the final distribution output and that you describe each variable on separate pages.

 

9. Describe any missing values you have declared and why you did so.

 

10. Describe any recodes you performed. Why did you make these recodes?
 

11. What is the measurement level of the variable (nominal, ordinal, or
interval)? See Korey on Levels of Measurement for details.

12. What are the mean, the mode, and the median?

13. Which measure is most useful? Why?

14. Dispersion: How spread out are the values of this variable? What is
the variance? The standard deviation? In your own words, what do these
measures indicate for this variable?

15. What shape is the distribution?

a. Is the distribution skewed? Report the skewness measure and interpret
the results in your own words.

b. What is the kurtosis of the distribution? In your own words, what do
these results mean?

c. Does the distribution have any other interesting features (e.g., is it
bimodal?).

16. Based on these results, how useful do you think the variable will be
in future analyses? Is there enough variation in the indicator?
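The measures of central tendency, dispersion, and shape asked for above can be computed as in the following minimal sketch. It rests on several assumptions: the scores are invented, describe is a hypothetical helper, and skewness and kurtosis use the simple moment-based formulas (the third and fourth central moments divided by powers of the second) with no small-sample correction. Statistical packages often apply such corrections, so their reported values may differ slightly. Whether the mean and variance are meaningful still depends on the variable's level of measurement.

```python
import statistics


def describe(values):
    """Summarize numeric codes from which missing values were already removed.

    Needs at least two cases and some variation; a constant variable
    makes the shape measures undefined (division by zero).
    """
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments, each divided by n (population formulas).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(values),
        "modes": statistics.multimode(values),     # more than one entry if bimodal
        "variance": statistics.variance(values),   # sample variance (divides by n - 1)
        "std_dev": statistics.stdev(values),       # square root of the sample variance
        "skewness": m3 / m2 ** 1.5,                # > 0: long tail toward high values
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,     # 0 for a normal distribution
    }


# Invented scores with most cases at the low end (not real survey data).
scores = [1, 1, 1, 2, 2, 3, 5]
summary = describe(scores)

for name, value in summary.items():
    print(f"{name:>16}: {value}")
```

A list with more than one entry under "modes" signals a bimodal or multimodal distribution, which bears on the question about other interesting features.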

 
