Behavioral Statistics

 

Homework #2 - Numerical Descriptive Statistics

For each problem, download the data file given in the Problem Description below.

For Helpful Hints, Click Here

Problem Description 1

As part of its recruiting process, the human resources department of a national company has administered an aptitude test to 100 applicants. Using SPSS, find the mean, median, mode, variance, standard deviation, and the range.

Datafile:

Remember to right-click the link to save the file!
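If you would like to check your SPSS output by hand, the short Python sketch below computes the same six statistics. It is only an illustration: the scores are hypothetical (they are not the homework data), and the function name describe is made up for this example. It uses the sample (n - 1) formulas for the variance and standard deviation, which should correspond to what SPSS reports.

```python
import statistics

def describe(scores):
    """Return the six summary statistics requested in the problem."""
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        # multimode returns a list because data can have more than one mode
        "mode": statistics.multimode(scores),
        # sample variance and standard deviation (divide by n - 1)
        "variance": statistics.variance(scores),
        "std_dev": statistics.stdev(scores),
        "range": max(scores) - min(scores),
    }

# Hypothetical aptitude scores, NOT the homework data file.
sample = [55, 60, 60, 70, 75, 80, 90]
summary = describe(sample)

for name, value in summary.items():
    print(name, value)
```

Note that the range here is a single number (largest score minus smallest score), and that the mode is returned as a list so that ties are not hidden.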

Problem Description 2

The director of the Master's of Psychology program wishes to determine the characteristics of the current students (using GRE scores). Using SPSS, find the mean, median, mode, variance, standard deviation, and the range.

Datafile:

Remember to right-click the link to save the file!

Problem Description 3

The amount of time needed by each of 100 respondents to complete a telephone survey is stored in the data file for this problem. (Times are rounded to the nearest whole minute.)
        a. Use SPSS to produce the mean, median, and mode.
        b. Describe briefly what each measure tells you about the data.

Problem Description 4

The summer incomes of a sample of 125 second-year psychology students are stored in the data file for this problem.
        a. Calculate the mean and median of these data.
        b. What do the two measures of central location tell you about second-year psychology students’ summer incomes?
        c. Which measure would you use to summarize the data? Explain.
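When thinking about parts b and c, it can help to see how differently the mean and the median react to a single extreme value. The sketch below uses a handful of hypothetical incomes (not the homework data) and compares both measures with and without one unusually large income.

```python
import statistics

# Hypothetical summer incomes in dollars, NOT the homework data file.
typical = [1800, 2100, 2300, 2500, 2600, 2900]
with_outlier = typical + [12000]  # one student with an unusually large income

mean_typical = statistics.mean(typical)
median_typical = statistics.median(typical)
mean_outlier = statistics.mean(with_outlier)
median_outlier = statistics.median(with_outlier)

print("without outlier:", round(mean_typical, 2), median_typical)
print("with outlier:   ", round(mean_outlier, 2), median_outlier)
```

In this made-up example the one large income raises the mean by well over a thousand dollars, while the median moves by only 100. Whether the homework data behave this way is something to check for yourself in SPSS.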

 

Problem Description 5

A high school student named David Merrell did a fascinating study of the effects of listening to rock music on the performance of rats in a maze. He had three groups of rats: one raised in the presence of rock music (performed by the group Anthrax), one raised in the presence of music by Mozart, and one raised in the absence of music. These animals learned to navigate a maze before exposure to the music, and then performed over three additional weeks. The data for this study are found in the data file for this problem.

The variables in the file are, in order, Subject, Group [1 = Control, 2 = Mozart, 3 = Anthrax], wk1r1, wk1r2, wk1r3, wk2r1 ... wk4r3 [4 weeks of 3 runs each], week1, week2, week3, week4 [weekly means], wt1, wt2, wt3, wt4 [weekly weights], median1--median4 [weekly medians].

Problem Description 6

An example that we will look at several times in the future comes from a study by Mireault (1990) investigating the effects of the death of a parent on the emotional well-being of college students. Among other things, she asked three different groups of college students to rate their perceived vulnerability to loss--i.e., how vulnerable they felt to the loss of someone important to them. The three groups were (1) a group who had had a parent die before they started college, (2) a group whose parents had divorced, and (3) a group whose parents were both alive and still married to each other. Download these data from the data file for this problem.

There are many variables here. They are, in order, ID, Group, Gender, YearColl, College, GPA, LostPGen, AgeAtLos, SomT, ObsessT, SensitT, DepressT, AnxT, HostT, PhobT, ParT, PsycT, GSIT, PVTotal, PVLoss, SuppTotl. We are interested in Group and PVLoss. The other variables will come up in other exercises.

Problem Description 7

Most of us have grown up to think of the geyser at Yellowstone named Old Faithful as just that--faithful and reliable. But actually it isn't very faithful at all, with times between eruptions varying between about 45 minutes and 90 minutes. (And it has gotten worse in the last few months, following recent earthquake activity.) Chatterjee et al. (Chatterjee, S., Handcock, M. S., & Simonoff, J. S. (1995). Casebook for a First Course in Statistics and Data Analysis. New York: Wiley) presented data on the timing of nearly 300 eruptions, as well as the length of each eruption.

The authors currently have these (and other) data available at geyser2a.dat. The variables, in order, are length of previous eruption, interval between eruptions, and a dichotomized version of the first variable. Draw histograms for the length of the previous eruption and for the dichotomized version of this variable.

  • What meaning would you attribute to the standard descriptive statistics?
  • Is the length of each eruption as variable as the Interval between them?
    What kinds of things might you look at to explain the variability in Intervals?
     

Comments from Samprit Chatterjee, Mark Handcock and Jeffrey Simonoff:

The Old Faithful geyser is a wonderful national icon, and is also a wonderful source of interesting data, due to its non-faithful faithful appearance. What do we mean by "non-faithful"? It is well-known that the time interval between eruptions of the geyser is not faithful at all (in the sense of being around one value consistently), as it has a bimodal distribution. About one-third of the time the time between eruptions is roughly 55 minutes, while about two-thirds of the time it is roughly 80 minutes. Unfortunately, the description in the article of average time intervals in different years gives the mistaken impression of a unimodal distribution.
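To see why a single average can mislead here, the rough figures in the comment above can be combined into one overall mean. The sketch below treats "about one-third at roughly 55 minutes" and "about two-thirds at roughly 80 minutes" as exact values, which is an assumption made only for illustration.

```python
# Approximate figures quoted in the comment above, treated as exact
# for the sake of illustration.
short_share, short_interval = 1 / 3, 55   # about one-third of intervals
long_share, long_interval = 2 / 3, 80     # about two-thirds of intervals

# Weighted mean of the two typical intervals
overall_mean = short_share * short_interval + long_share * long_interval

print(round(overall_mean, 1))  # 71.7
```

Under these assumptions the overall mean is close to 72 minutes, a value that sits between the two clusters and describes neither of them well.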

We have used Old Faithful eruption data (circa 1978, 1979 and 1985) in our introductory classes for many years, and made a case based on these data the lead case in our book "A Casebook for a First Course in Statistics and Data Analysis" (Wiley, 1995), since students are often very surprised to find out just how faithful (or non-faithful) Old Faithful is. The specific characterization of the bimodal distribution mentioned in the previous paragraph comes from those data. The case also points out that a simple way to predict the time interval until the next eruption is to check whether the duration of the previous eruption was short (less than 3 minutes) or long (more than 3 minutes), and predict accordingly (55 minutes until the next eruption, or 80 minutes, respectively). This rule, derived using the 1978/1979 data, correctly predicts the 1985 values to within plus or minus 10 minutes about 90% of the time, right in line with what the Times article states.
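The prediction rule described above is simple enough to write out directly. The following is a minimal sketch, not the authors' own code: the function names and the five eruptions are hypothetical, and an eruption lasting exactly 3 minutes is treated as "long" because the comment does not say how to classify it.

```python
def predict_interval(previous_duration, cutoff=3.0):
    """Predict minutes until the next eruption from the previous duration.

    Short eruptions (under the cutoff) predict 55 minutes; all others
    predict 80. Treating exactly 3 minutes as "long" is an assumption.
    """
    return 55 if previous_duration < cutoff else 80

def hit_rate(durations, intervals, tolerance=10):
    """Share of predictions within +/- tolerance minutes of the actual interval."""
    hits = sum(
        abs(predict_interval(d) - actual) <= tolerance
        for d, actual in zip(durations, intervals)
    )
    return hits / len(durations)

# Hypothetical eruptions (duration in minutes, following interval in minutes),
# NOT values from geyser2a.dat.
durations = [1.8, 4.2, 2.0, 4.5, 3.9]
intervals = [54, 78, 51, 85, 62]

print(hit_rate(durations, intervals))  # 0.8
```

With these made-up values, four of the five predictions fall within ten minutes of the actual interval. Running the same check on the real data is one way to examine the "about 90% of the time" figure quoted above.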

 
Time trouble for geyser: It's no longer Old Faithful.
The New York Times, 5 Feb. 1996, D1
James Brooke


Rick Hutchison, Yellowstone National Park's research geologist, reported that Old Faithful, the park's leading tourist attraction, has been slowing down. In 1950, the average time interval between eruptions was 62 minutes, in 1970 it was 66 minutes, and today it is 77 minutes. It is also apparently becoming more difficult to predict the time until the next eruption, with forecasts now being to within plus or minus ten minutes.

The changes of recent years seem to be produced by seismic activity. Scientists theorize that earthquakes can have two effects on geysers, either speeding up or slowing down the rate of supply of water. Quakes can either shake loose debris that clogs the rock channels feeding water to a geyser, resulting in more water and steam, or crack open new underground channels, redirecting water to other geysers or hot springs. It is speculated that the latter process is affecting Old Faithful.

 




© 2008, David M. Compton, Ph.D.