TAKEHOME FINAL due on or before Wed 12-21 by 3 P.M.
(Return to Prof. Sawyer or
to math receptionist in Cupples I Room 100.)
NOTE: There should be NO COLLABORATION on the takehome final,
other than for the mechanics of using the computer.
Open textbook and notes (including course handouts).
In general where the results of a statistical test are asked for,
(i) EXPLAIN CLEARLY what the hypothesis H0 is and what
alternative you are testing against,
(ii) find the P-value for the test indicated (and state what test
you used), and
(iii) state whether the results are significant (P<0.05), highly
significant (P<0.01), or not significant (P >= 0.05). If the P-value
is based on a Student's t or Chi-square or F distribution, also give the
degrees of freedom. (WARNING: An F distribution has TWO degrees of
freedom, one for the numerator and one for the denominator.)
ORGANIZE YOUR WORK in the following manner:
(i) your answers to all questions,
(ii) all your SAS programs, and
(iii) all your SAS output.
ADD CONSECUTIVE PAGE NUMBERS to part (iii) of your homework so that
you can make references from part (i) to part (iii). For
example, so that you can say things like, ``The answer in part (a)
is 57.75. The scatterplot for part (b) is on page #Y below.'' It
may be clearest to write page numbers yourself on the SAS output.
Different parts of problems may not be equally weighted.
5 problems.
Problem 1. Heights and weights for the employees of Vaporware
Computer Services are recorded in Table 1. Each table entry has the
height, weight, and sex for one employee, in that order. The employees of
this company are known to be unusual.
Table 1 --- Heights and Weights for Employees
68 154 M 62 89 F 62 86 F 59 117 F
71 125 M 63 81 F 70 137 M 60 88 F
61 89 F 62 121 F 58 134 F 60 96 F
65 90 F 67 85 F 67 96 F 78 122 M
64 86 F 71 164 M 60 87 F 64 117 F
72 162 M 65 84 F 73 129 M 70 143 M
71 149 M 63 114 F 59 110 F 67 125 M
64 80 F 74 156 M 59 108 F 64 120 F
68 162 M 60 136 F 69 155 M 77 120 M
60 109 F 72 134 M 56 140 F 75 132 M
73 117 M 68 75 F 62 102 F 60 104 F
67 134 M 60 107 F 76 137 M 73 124 M
59 85 F 63 83 F 60 120 F 63 85 F
64 127 F 73 133 M 69 119 M 72 120 M
63 92 F 57 141 F 77 138 M 61 120 F
68 134 M 59 102 F 58 121 F 77 110 M
76 127 M 56 135 F 69 154 M 56 116 F
59 89 F 70 164 M 73 129 M 76 117 M
67 80 F 68 131 M
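As a matter of computer mechanics only, Table 1 could be entered into SAS along the following lines. This is a minimal sketch, not a required layout: the data set name employees and the variable names height, weight, and sex are assumptions made for illustration.

```sas
/* Sketch: read Table 1, which has up to four (height, weight, sex) */
/* entries per line. The names employees, height, weight, and sex   */
/* are illustrative assumptions, not part of the problem statement. */
data employees;
  /* The trailing @@ holds the input line so that several entries */
  /* can be read from the same record.                            */
  input height weight sex $ @@;
  datalines;
68 154 M 62 89 F 62 86 F 59 117 F
71 125 M 63 81 F 70 137 M 60 88 F
61 89 F 62 121 F 58 134 F 60 96 F
65 90 F 67 85 F 67 96 F 78 122 M
64 86 F 71 164 M 60 87 F 64 117 F
72 162 M 65 84 F 73 129 M 70 143 M
71 149 M 63 114 F 59 110 F 67 125 M
64 80 F 74 156 M 59 108 F 64 120 F
68 162 M 60 136 F 69 155 M 77 120 M
60 109 F 72 134 M 56 140 F 75 132 M
73 117 M 68 75 F 62 102 F 60 104 F
67 134 M 60 107 F 76 137 M 73 124 M
59 85 F 63 83 F 60 120 F 63 85 F
64 127 F 73 133 M 69 119 M 72 120 M
63 92 F 57 141 F 77 138 M 61 120 F
68 134 M 59 102 F 58 121 F 77 110 M
76 127 M 56 135 F 69 154 M 56 116 F
59 89 F 70 164 M 73 129 M 76 117 M
67 80 F 68 131 M
;
run;

/* Print the first few observations to check the data entry. */
proc print data=employees (obs=8);
run;
```

Printing a few observations before running any analysis is a cheap way to confirm that each entry was split into the intended three variables.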
(i) For the employees grouped into two samples by sex, what are the
two sample sizes? What are the two sample means for height?
Is there a significant difference in height between the two sexes?
Use SAS to find out. What is the value of the t-statistic for the
classical two-sample t-test? What is the P-value? What is the number of
degrees of freedom of the t statistic?
The classical t-test assumes that the variances of the two samples
are the same. Is this a reasonable assumption here? Why? What is the
P-value for a test of the hypothesis that the variances are equal? Does
this P-value mean that it is safe to assume that the variances are the
same, or the opposite?
(ii) What is the Pearson correlation coefficient between height and
weight for the individuals in Table 1, ignoring sex? Is it
significantly different from zero? What is the P-value? Is this a
one-sided or a two-sided P-value, in the sense that it tests for rho
being either less than zero or greater than zero?
How was the P-value for the Pearson correlation coefficient
calculated? What is the number of degrees of freedom of the test
statistic that is used to calculate the P-value? What is the formula that
expresses this test statistic in terms of rho?
(iii) What are the Pearson correlation coefficients between height and
weight for employees within each sex? Are they significant? Do they have
the same sign as the correlation coefficient in part (ii)? How can the
correlation coefficients have one sign within groups but a different sign
for the two groups combined? Construct a height by weight scatter plot
using sex as the plotting symbol to illustrate your answer.
Problem 2. A marketing company carries out a survey to test
whether consumer evaluations of a product depend on gender. The results
of a large-scale study in three cities are shown in Table 2.
Table 2 - Consumer Impressions in Three Cities
City:        City1        City2        City3
Opinion:    Y     N      Y     N      Y     N
Female    155   298    328   149    373   424
Male       71   162    599   328    125   185
(i) Based on this data, for all three locations together, is the product
viewed differently by females than by males? (Carry out an appropriate
two-sided test, which means that it should test for either more favorably
or less favorably.) WARNING: Since row and column proportions of the three
tables are NOT the same, combining the 2 by 2 tables into a single 2 by 2
table might lead to Simpson's Paradox.
What test did you use? Is the P-value one-sided or two-sided? (That
is, is it also sensitive to the possibility that females might view the
product less favorably overall?)
(ii) Are females more likely or less likely to like the product within
each of the cities? What are the phi coefficients for the three
contingency tables? Does phi>0 mean that females view the product more
favorably than males, or less favorably?
(iii) Now aggregate the data (possibly incorrectly) into a single 2 by 2
table. How do your conclusions differ? What is the P-value for this
(possibly incorrect) 2 by 2 table? What is the new phi coefficient? Is
the sign of the new phi coefficient the same as at the individual
locations? Are the women in the aggregated table relatively more
favorable to the product in comparison with men than in the individual
locations, relatively less favorable, or the same?
Problem 3. An engineer is interested in the resonant
frequency of a mechanical device as a function of three variables:
Pressure, with three levels (Press1,Press2,Press3), Drubness, with two
levels (Drub1,Drub2), and Abrasiveness, with three levels
(Abr1,Abr2,Abr3). The resonant frequencies of two devices are measured
for each set of levels of the three variables. The resulting frequencies
are listed in Table 3.
Table 3. Frequencies of a Device
              Press1                  Press2                  Press3
         Drub1       Drub2       Drub1       Drub2       Drub1       Drub2
Abr1   3839 3202    326  117   5950 1254    357 1550    484  227   1915 2924
Abr2   1313 3202    276  368   1574 8814    530  538   1046 1128   1373 2795
Abr3   2097 6417    374  429   3614 1293    238 2476    201  886   1803 1647
(i) Use SAS to run a full factorial model with the three variables as
three factors. Is the Model Test significant? What is its P-value?
(ii) Plot the residuals against the predicted values in the model. In
order to get a better idea of the distribution of the residuals, include
the level of Pressure in the residual plot as the plotting symbol. (Make
sure that the plotting symbol identifies the Pressure level!)
Do the residuals appear to be independent of the predicted value and
of the value of Pressure? Why? Do the residuals appear to be normally
distributed? Carry out a test that provides a P-value for the normality
of the residuals.
(iii) Run the full factorial model again with the values in Table 3
replaced by their logarithms. Is the Model test now more significant?
Analyze the residuals of the log-transformed data in the same way as in
part (ii). Do they now look more independent of the predicted value
and of the value of Pressure?
(iv) Which of the main effects of Abrasiveness, Drubness, and Pressure
are significant for the log-transformed data? highly significant? Which
of the four interactions are significant? highly significant? What are
the P-values for the significant effects? For the effects that are
significant, what are the degrees of freedom for the F-tests involved,
numerator and denominator?
(v) How were the F-statistics calculated for the tests in part (iv)
that were significant? What was the denominator? Is the denominator of
the F-statistics in the output?
(vi) For each of the two-way interactions that are significant, display
an interaction plot. For each such interaction, is the interaction
visible in the interaction plot? What can you conclude about the
interaction and how it affects the dependent variable (that is, the
resonant frequency of the device)?
Problem 4. An international baseball organization conducts a survey to
compare the throwing expertise of catchers in a sample of Little League
teams distributed among 4 Leagues. Proficiency scores for making an
accurate throw from home to second base were made for 3 catchers on each
team. The international organization wants to know where most of the
variation of catcher throwing skills is located: among teams within
leagues, between leagues, or a combination of both. The survey data is in
Table 4.
Table 4 --- Catcher throwing proficiencies by Team and League
League1
Team1 71 68 75 Team2 52 57 63 Team3 74 67 78
Team4 76 91 71
League2
Team1 56 54 57 Team2 70 66 64 Team3 71 62 62
League3
Team1 70 50 64 Team2 59 61 74 Team3 53 65 57
Team4 62 59 72 Team5 69 80 65 Team6 56 76 74
Team7 64 62 49 Team8 61 73 48 Team9 47 57 51
League4
Team1 74 78 62 Team2 78 76 73 Team3 64 54 50
Team4 70 68 66 Team5 65 72 73
Note that ``Team1'' does not refer to the same team in different
leagues, which might be in different parts of the world, but only to the
first team in that league that happened to send its catcher scores in to
the international organization. Treat the three observations for each
team as an independent sample for that team.
(i) Using within-team variation to estimate the error, was there
significant variation in the proficiency scores over the 15 or more teams
in the study, ignoring the leagues that contain them? What is the
P-value? What are the degrees of freedom of the resulting F statistic?
(ii) Analyze the appropriate ANOVA model taking into account both teams
and leagues. Is there significant variation in the scores by league? Is
there significant variation by teams within leagues? What are the
P-values in each case? What are the degrees of freedom of the two F
statistics involved?
(iii) For the analysis in part (ii), which pairs of leagues differ
significantly in terms of catcher scores? Run the appropriate Duncan
procedure to find out.
(iv) What are the MSS (Mean Sum of Squares) values for within-team
variation, between-league variation, and variation between teams within
leagues? Are these consistent with your answers to part (ii)? How
are the F-statistics in part (ii) computed in terms of these MSS
values?
(v) Is there significant variation in the scores by league, ignoring any
team structure within each league? (That is, assume that everybody in a
league is on the same team, including perhaps dozens of catchers.) What
is the P-value? What are the degrees of freedom of the F statistic? Why
is this P-value different from the P-value for league in part (ii)?
Problem 5. After reading the lab notebook and checking with
his technicians, the sound engineer in Problem 3 becomes concerned
that the second replications in Table 3 may not be reliable. The
second replication was done the week after the first replication under
less stable conditions. (The first and second replications are the first
and second values in each cell in Table 3, respectively, where cells
are the possible combinations of Pressure, Drubness, and Abrasiveness.)
The engineer wants to discard all of the second replications in
Table 3 and only analyze the first replication in each cell.
Unfortunately, this leads to data with three factors (Pressure, Drubness,
and Abrasiveness) and only one replication per cell.
A friend of the engineer in the forestry service tells the engineer about
split-plot analyses. The friend says that the assumptions appear to be
satisfied here. The model can be considered to have a major factor
(Drubness) and two minor factors (Pressure and Abrasiveness). For
engineering reasons, the two minor factors should not be expected to have
strongly nonadditive effects, so that the Pressure*Abrasiveness
interaction and the three-way interaction are plausible candidates to
estimate error. (Note that the sum of these two effects is
Pressure*Abrasiveness nested within Drubness.)
(i) Carry out this split-plot analysis for the log-transformed data
corresponding to the first replication. Answer the same questions in
parts (iii)-(vi) of Problem 3 for the smaller data set and for the
effects that can be tested in the split-plot analysis.
Among other things, find the residual plot and test normality, find
all effects that are significant, identify the denominator of the
F-statistics involved, and display interaction plots for significant
interactions. (Warning: Don't forget to use log-transformed data!)
(Hint: If you constructed the SAS data set for Problem 3
with a variable Rep=1 for the first replication and Rep=2 for the second
replication in each cell, then you can subset the data set using the
command ``if Rep=1;''.)
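The subsetting step in the hint might look roughly as follows. This is only a sketch of the mechanics: the data set names freq and freq1, the variable names Abr, Press, Drub, Freq, and LogFreq, and the numeric coding of the factor levels are all assumptions made for illustration. Your own Problem 3 data set may be organized differently.

```sas
/* Sketch: enter Table 3 with an explicit Rep variable.                */
/* Values are read row by row: for each Abrasiveness level, the        */
/* Pressure levels in order; within each Pressure level, Drub1 and     */
/* then Drub2; and within each cell, replication 1 then replication 2. */
/* All data set and variable names here are illustrative assumptions.  */
data freq;
  do Abr = 1 to 3;
    do Press = 1 to 3;
      do Drub = 1 to 2;
        do Rep = 1 to 2;
          input Freq @@;
          output;
        end;
      end;
    end;
  end;
  datalines;
3839 3202 326 117 5950 1254 357 1550 484 227 1915 2924
1313 3202 276 368 1574 8814 530 538 1046 1128 1373 2795
2097 6417 374 429 3614 1293 238 2476 201 886 1803 1647
;
run;

/* Keep only the first replication and add the log-transformed response. */
data freq1;
  set freq;
  if Rep=1;               /* subsetting IF: drops observations with Rep=2 */
  LogFreq = log(Freq);    /* natural logarithm */
run;
```

The subsetted data set should contain 18 observations, one for each combination of Pressure, Drubness, and Abrasiveness.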
(ii) The residual plot for the subsetted data contains what may be
a suspicious value. This value (which has Pressure=Press3) can be
identified as the observation with the largest positive residual. Find
the Studentized residual and the CookD statistic for this
observation. Is this apparent outlier troublesome, on the basis of a
Studentized residual larger than 3.0 or a CookD statistic larger
than 1.0?
(Note: A more standard rule for the CookD statistic is whether
the CookD value is greater than the median of the F-distribution with
parameters F(r,n-r), where n is the number of observations and r is the
number of fitted parameters. Using 1.0 instead is a rough
approximation.)
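For reference, the median of an F distribution can be obtained from the SAS QUANTILE function. The sketch below uses purely hypothetical values of n and r to show the mechanics. They are not the values for this problem, which you need to determine from your own model.

```sas
/* Sketch: median of F(r, n-r) as a cutoff for the CookD statistic.   */
/* The values of n and r below are hypothetical placeholders only.    */
data _null_;
  n = 30;                                  /* hypothetical number of observations      */
  r = 6;                                   /* hypothetical number of fitted parameters */
  cutoff = quantile('F', 0.5, r, n - r);   /* 50th percentile of F(r, n-r)             */
  put cutoff=;                             /* writes the value to the SAS log          */
run;
```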