HOMEWORK #3 due 10-18
Text references are to Cody & Smith,
DrA1:  13.32  18.87  14.61  15.02  15.42  16.23  14.01
DrA2:  17.01  18.14  18.06  18.46  15.91  16.94  14.50
DrH1:  17.83  18.13  19.89  19.01  16.84  19.53  14.77
DrC2:  20.83  19.87  21.04  17.12  20.50  17.55  20.17
DrC3:  19.62  19.03  20.11  20.52  21.05  20.21  25.91
DrA drugs should behave similarly in the human body due to a similar chemical structure, but that the two DrC drugs should be metabolized differently.
Using the same MSE as in the previous analyses, test whether or not the AVERAGE of the two DrA drugs is significantly different from the AVERAGE of the two DrC drugs. What is the P-value? (Hint: Use a Contrast test. See for example OnewayMC.sas on the Math475 Web site.)
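One possible way to set up the contrast test from the hint is sketched below. This is not the course's OnewayMC.sas; the data set name drugs and the variable names drug and resp are made up for illustration. The sketch assumes the MSE in question is the one from the one-way ANOVA on all five drugs, which is what proc glm uses for a contrast.

```sas
/* Hypothetical names: drugs, drug, resp. Data are the five rows given above. */
data drugs;
  input drug $ @;          /* read the drug label and hold the line */
  do i = 1 to 7;           /* seven responses per drug */
    input resp @;
    output;
  end;
  drop i;
  datalines;
DrA1 13.32 18.87 14.61 15.02 15.42 16.23 14.01
DrA2 17.01 18.14 18.06 18.46 15.91 16.94 14.50
DrH1 17.83 18.13 19.89 19.01 16.84 19.53 14.77
DrC2 20.83 19.87 21.04 17.12 20.50 17.55 20.17
DrC3 19.62 19.03 20.11 20.52 21.05 20.21 25.91
;
run;

proc glm data=drugs;
  class drug;
  model resp = drug;
  /* Class levels are sorted as DrA1 DrA2 DrC2 DrC3 DrH1,  */
  /* so the coefficients must be listed in that order.     */
  contrast 'DrA avg vs DrC avg' drug 1 1 -1 -1 0;
run;
quit;
```

The coefficients 1 1 -1 -1 0 compare the sums rather than the averages, but since both groups contain two drugs, rescaling by 1/2 does not change the F statistic or its P-value.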
Level1   79   79   95  109  118  150
Level2   84   95  100  105  119  135
Level3  109  114  121  123  124  145
Level4   91  106  119  150  151  151
Level5  110  113  129  131  145  165
yy and stress) for each of 16 gnus under various conditions of stress are given in the following table. (In each of the following 16 pairs of data, yy is the first variable and stress the second variable.)
  47  3.0      50  1.8     110  7.9    1655 15.7
 179  9.1      55  5.2    1310 12.9    2773 15.1
  56  3.6      62  2.9    3052 16.8     126  7.2
 866 12.6     175  8.6    2731 16.7     249  9.0
yy on stress with this data? What P-value does SAS report? What is the model R^2?
yy versus stress. Include the predicted values on the same plot with plot symbol P as a comparison. Does the plot of yy versus stress look linear? How well does it follow the predicted values? (Hint: It might look slightly bowed down in the middle.)
yy on stress against stress. Do the residuals look consistent with the assumptions of a linear regression? Do their signs and absolute values appear to be randomly distributed with respect to stress? (Hint: The negative residuals may be bunched together in the center.)
yy on both stress and stress*stress. (Hint: Introduce a new SAS variable stress2 for stress*stress.) What is the new model R^2? In a plot of yy on stress, do the predicted values appear to match yy more closely? Do the residuals have a more random-looking plot on stress? (Hint: Observations with higher values of stress may also have larger residuals.)
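The hint about stress2 could be carried out along the following lines. This is only a sketch: the data set names gnus and quadfit and the output variable names pred and resid are hypothetical, and the 16 data pairs are copied from the table above.

```sas
/* Hypothetical names: gnus, quadfit, pred, resid. */
data gnus;
  input yy stress @@;        /* @@ reads several (yy, stress) pairs per line */
  stress2 = stress*stress;   /* squared term suggested in the hint */
  datalines;
47 3.0 50 1.8 110 7.9 1655 15.7 179 9.1 55 5.2 1310 12.9 2773 15.1
56 3.6 62 2.9 3052 16.8 126 7.2 866 12.6 175 8.6 2731 16.7 249 9.0
;
run;

proc reg data=gnus;
  model yy = stress stress2;
  output out=quadfit p=pred r=resid;   /* save predicted values and residuals */
run;
quit;

proc plot data=quadfit;
  /* observed values as *, predicted values as P, on the same axes */
  plot yy*stress='*' pred*stress='P' / overlay;
  /* residuals against stress with a reference line at zero */
  plot resid*stress / vref=0;
run;
quit;
```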
logyy=log(yy) on stress and stress*stress. What is the new model R^2? Do the predicted values of logyy appear to match the observed values more closely? Does the residual plot show less dependence on stress?
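A minimal sketch of the log-scale fit follows, assuming the natural logarithm is intended (the SAS log function is the natural log). The names gnulog, logfit, pred and resid are hypothetical.

```sas
/* Hypothetical names: gnulog, logfit, pred, resid. */
data gnulog;
  input yy stress @@;
  logyy   = log(yy);         /* natural logarithm of the response */
  stress2 = stress*stress;
  datalines;
47 3.0 50 1.8 110 7.9 1655 15.7 179 9.1 55 5.2 1310 12.9 2773 15.1
56 3.6 62 2.9 3052 16.8 126 7.2 866 12.6 175 8.6 2731 16.7 249 9.0
;
run;

proc reg data=gnulog;
  model logyy = stress stress2;
  output out=logfit p=pred r=resid;
run;
quit;

proc plot data=logfit;
  plot logyy*stress='*' pred*stress='P' / overlay;
  plot resid*stress / vref=0;
run;
quit;
```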
DO ALL OF PROBLEM 3 in one SAS Program.
Y along with two covariates. Being utterly devoid of imagination, the experimenter calls the covariates AA and BB. The 30 instances of values Y,AA,BB are
Obs     Y     AA     BB
 1.   714  366.3   1421
 2.  1022  435.8   1737
 3.   267  276.1    532
 4.   287  199.6    571
 5.   716  257.4   1115
 6.   434  203.5   1011
 7.   943  248.1   1676
 8.   356  186.5    712
 9.   423  246.2    624
10.   698  196.3   1312
11.    92  227.8   1151
12.   227  206.4    687
13.   589  178.7   1215
14.   716  296.6   1099
15.   324  235.9    843
16.   552  449.1   1504
17.   741  259.7   1227
18.   437  291.7    439
19.   143  198.0    265
20.   409  336.0    939
21.   654  279.8    438
22.   666  243.0    379
23.   479  318.3   1208
24.   212  176.5    217
25.   375  266.4    674
26.   184  226.0    522
27.   220  114.4    683
28.   392  231.4    929
29.   555  203.5    662
30.   862  328.6    906
Y on the covariates AA and BB? Run proc glm in SAS to find out. What is the model P-value? What is the model R^2? What is the value of the F-statistic that led to the model P-value? How many degrees of freedom does it have in its numerator and denominator?
proc glm output? What are their P-values?
proc glm output? What are their P-values? Why are the answers different from those in part (ii)?
proc reg to construct a table of Studentized residuals and CookD statistics for each observation. Which observation corresponds to the odd observation in part (iv)? What is the CookD value for that particular observation? Does the CookD value seem large? In general, do any of the observations have CookD values that are large?
proc reg, enter a "paint" command like (for example) paint ord=17 / symbol='X'; BEFORE the plot statements, or proc plot, enter the plot statement as (for example) plot Y*X $ ord or plot Y*X='*' $ ord. The $ causes the value of the variable ord to be displayed next to each plotted point.)
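As a rough illustration of the residual table and of the $ labeling described in the hint, something like the following might be used. The data set names hw3 and diag and the output variable names pred, stud and cd are hypothetical; ord holds the observation number, as in the hint. The r option on the model statement asks proc reg to print a residual analysis, which includes Studentized residuals and Cook's D.

```sas
/* Hypothetical names: hw3, diag, pred, stud, cd. */
data hw3;
  input ord Y AA BB @@;      /* three observations per data line */
  datalines;
1 714 366.3 1421   2 1022 435.8 1737   3 267 276.1 532
4 287 199.6 571    5 716 257.4 1115    6 434 203.5 1011
7 943 248.1 1676   8 356 186.5 712     9 423 246.2 624
10 698 196.3 1312  11 92 227.8 1151    12 227 206.4 687
13 589 178.7 1215  14 716 296.6 1099   15 324 235.9 843
16 552 449.1 1504  17 741 259.7 1227   18 437 291.7 439
19 143 198.0 265   20 409 336.0 939    21 654 279.8 438
22 666 243.0 379   23 479 318.3 1208   24 212 176.5 217
25 375 266.4 674   26 184 226.0 522    27 220 114.4 683
28 392 231.4 929   29 555 203.5 662    30 862 328.6 906
;
run;

proc reg data=hw3;
  id ord;                    /* show ord beside each row of the table */
  model Y = AA BB / r;       /* r: residual analysis incl. Cook's D */
  output out=diag p=pred student=stud cookd=cd;
run;
quit;

proc plot data=diag;
  /* $ ord prints the observation number next to each plotted point */
  plot stud*pred $ ord / vref=0;
run;
quit;
```

With the points labeled this way, an unusual residual in the plot can be matched to its row in the table by its ord value.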
proc glm or proc reg on the data set without the apparent outlier in the residual plots. Do the parameter estimates seem qualitatively the same as before? The P-values of the parameters? The R-square value?
proc glm or proc reg starting from the original data set but without the value with the large CookD value. Do the parameter estimates seem qualitatively the same as before? The P-values of the parameters? The R-square value? Which observation made the most difference?