HOMEWORK #4 due 11-15
NOTES: (THIS IS ALSO on the Math434 Web site.)
2. NOTE: If a problem asks you to do a statistical test, EXPLAIN CLEARLY what the null hypothesis H_0 is, what the alternative H_1 is, what test you used, what the P-value is, and whether the data is significant, highly significant, or neither. Include this as part of your answer in part (i).
Table 1 - Data for 80 subjects
Values for each subject are (a) group (Green or Blue), (b) failure time or time last seen, (c) status (0 for observed failure, 1 for censored event), and (d) the values of two covariates (DVAL and FVAL).

 No. Group Time Status DVAL FVAL     No. Group Time Status DVAL FVAL
  1. Green   19      0   43   85     41. Blue    88      0   33   63
  2. Green   23      0   45   77     42. Blue    89      0   38   41
  3. Blue    24      0   45   51     43. Green   91      0   49   77
  4. Blue    26      0   40   66     44. Blue    93      0   20   50
  5. Green   30      0   26   83     45. Green   93      0   32   54
  6. Blue    36      0   34   68     46. Green   95      0   27   76
  7. Green   36      0   32   62     47. Green   96      0   48   53
  8. Blue    40      0   35   48     48. Green   97      0   49   57
  9. Green   49      0   17   47     49. Green   99      0   26   52
 10. Green   50      0   17   48     50. Green  104      0   36   40
 11. Green   54      0   48   47     51. Green  104      0   20   42
 12. Blue    55      0   40   46     52. Green  106      0   20   44
 13. Blue    56      0   44   53     53. Green  107      0   33   68
 14. Blue    60      0   14   63     54. Green  107      0   13   56
 15. Blue    60      0   33   62     55. Green  108      0   36   66
 16. Blue    62      0   44   45     56. Green  108      0   40   57
 17. Green   62      0   33   52     57. Green  111      0   43   38
 18. Green   67      0   28   59     58. Green  113      0   32   65
 19. Blue    68      0   41   51     59. Green  116      0   30   57
 20. Green   69      0   35   47     60. Green  119      0   36   71
 21. Blue    69      0   24   52     61. Green  122      0   30   72
 22. Green   69      0   43   58     62. Green  132      0   31   60
 23. Green   70      0   22   66     63. Blue   142      0   16   35
 24. Blue    70      0   29   45     64. Green  150      0   38   42
 25. Green   71      0   41   58     65. Green  153      0   26   35
 26. Blue    71      0   17   79     66. Blue    23      1   32   76
 27. Blue    72      0   32   53     67. Blue    30      1   40   72
 28. Blue    73      0   43   47     68. Green   33      1   38   65
 29. Green   79      0   40   60     69. Blue    34      1   43   68
 30. Blue    80      0   30   48     70. Green   59      1   44   67
 31. Blue    80      0   41   75     71. Green   68      1   42   51
 32. Blue    83      0   31   62     72. Blue    72      1   35   67
 33. Green   83      0   37   73     73. Green   78      1   49   47
 34. Blue    83      0   43   55     74. Green   86      1   14   74
 35. Green   84      0   40   77     75. Green   87      1   42   54
 36. Green   85      0   43   70     76. Blue    89      1   24   62
 37. Blue    85      0   22   62     77. Green  100      1   41   49
 38. Green   85      0   46   64     78. Green  115      1   19   48
 39. Green   85      0   17   74     79. Blue   115      1   39   44
 40. Green   88      0   16   42     80. Green  131      1   37   63
(i) Analyze the data in Table 1 using an AFT Weibull regression on Group
(as a class variable), DVAL, and FVAL.
Do the failure times depend significantly on the group? On DVAL? On FVAL?
Find the P-values for the variables that are significant.
If Group is significant, which group (Green or Blue) has the longer
expected survival time? How can you tell from the output? If DVAL or FVAL
is significant, do larger values of that variable lead to longer survival
times or shorter survival times? How can you tell from the output?
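A minimal sketch of the kind of PROC LIFEREG call that part (i) has in mind. The data set name table1 and the variable names group, time, status, dval, and fval are assumptions made for illustration; substitute whatever names you used when reading in Table 1. Because status=1 marks a censored time in Table 1, the censoring value listed in the model statement is 1.

```sas
/* Sketch only: table1 and all variable names are assumed, not given in the handout */
proc lifereg data=table1;
   class group;                                            /* Group as a class variable */
   model time*status(1) = group dval fval / dist=weibull;  /* status=1 means censored   */
run;
```

In an AFT model the coefficients act on the log of the failure time, so the sign of an estimate indicates the direction of its effect on survival time.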
(ii) What value of the Weibull parameter alpha does SAS estimate, if the Weibull distribution is written as S_X(t) = exp(-(lambda_X t)^a) for a = alpha? Does the confidence interval for alpha overlap alpha = 1?
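One way to connect the SAS output to this parametrization, sketched under the assumption that the AFT model is fit in the log-linear form log T = x'beta + sigma W, where W is a standard (minimum) extreme-value error with P(W > w) = exp(-e^w). The symbols sigma (the quantity labeled Scale in the output) and x'beta (the linear predictor, intercept included) are notation introduced for this sketch only.

```latex
\begin{align*}
S_X(t) &= P(T > t)
        = P\!\left(W > \frac{\log t - x'\beta}{\sigma}\right)
        = \exp\!\left(-\exp\!\left(\frac{\log t - x'\beta}{\sigma}\right)\right) \\
       &= \exp\!\left(-\left(t\,e^{-x'\beta}\right)^{1/\sigma}\right)
        = \exp\!\left(-(\lambda_X t)^{\alpha}\right),
\qquad \alpha = \frac{1}{\sigma}, \quad \lambda_X = e^{-x'\beta}.
\end{align*}
```

Under this assumption alpha is the reciprocal of the estimated scale, and alpha = 1 corresponds to scale = 1, that is, to the exponential model.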
(iii) Answer the questions in part (i) for an AFT model with exponential errors. Are any of your conclusions different? In particular, does assuming exponential instead of Weibull-distributed errors increase the significance of the covariates for the data in Table 1, decrease the significances, or leave them about the same?
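For comparison, a sketch of the same fit with exponential errors. As before, the data set name table1 and the variable names group, time, status, dval, and fval are assumed for illustration and are not specified in this handout.

```sas
/* Sketch only: same assumed names as for the Weibull fit */
proc lifereg data=table1;
   class group;
   model time*status(1) = group dval fval / dist=exponential;  /* scale held at 1 */
run;
```

With dist=exponential the scale is fixed at 1 rather than estimated. PROC LIFEREG typically also prints a Lagrange multiplier test for the fixed scale in this case; check your own output before relying on it.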
(iv) Does an AFT model with exponentially-distributed errors fit these data as well as an AFT model with Weibull errors? Find a P-value for the hypothesis that the failure times are consistent with an exponential model against the alternative of a Weibull model. Do you conclude that a Weibull model would be more consistent with the data, would be less consistent with the data, or would be about the same?
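Since the exponential model is the Weibull model with alpha = 1, one standard way to get such a P-value is a likelihood ratio test. The sketch below assumes the two maximized log likelihoods are read from the two SAS outputs; the symbols ell_W and ell_E are notation introduced here, not names that appear in the output.

```latex
\begin{align*}
H_0 &: \alpha = 1 \ \text{(exponential)}
\qquad \text{versus} \qquad
H_1 : \alpha \neq 1 \ \text{(Weibull)}, \\
\Lambda &= 2\,(\ell_W - \ell_E) \;\overset{\text{approx.}}{\sim}\; \chi^2_1 \ \text{under } H_0,
\qquad
P = P\!\left(\chi^2_1 \ge \Lambda\right).
\end{align*}
```

The test has one degree of freedom because the Weibull model has exactly one more free parameter than the exponential model. In a SAS data step the tail probability can be computed as 1 - probchi(Lambda, 1).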
What is the ratio of the estimated hazard rates between individuals in the two groups? Are Green individuals at a greater or smaller hazard? (Hint: Be careful. Recall that a longer lifetime means a smaller hazard rate, and vice versa, and keep track of confounding with alpha.)
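The hint can be made concrete with a short derivation, sketched under the assumption that the Weibull AFT model is written as S_X(t) = exp(-(lambda_X t)^alpha) with lambda_X = exp(-x'beta). The symbols beta_G and beta_B, denoting the fitted group effects on the log-time scale for Green and Blue, are notation introduced for this sketch; the comparison holds DVAL and FVAL fixed.

```latex
\begin{align*}
h_X(t) &= -\frac{d}{dt}\log S_X(t) = \alpha\,\lambda_X^{\alpha}\,t^{\alpha-1}, \\
\frac{h_{\text{Green}}(t)}{h_{\text{Blue}}(t)}
  &= \left(\frac{\lambda_{\text{Green}}}{\lambda_{\text{Blue}}}\right)^{\alpha}
   = \exp\!\bigl(-\alpha\,(\beta_G - \beta_B)\bigr).
\end{align*}
```

The ratio does not depend on t, and the AFT coefficient difference is multiplied by alpha and changes sign when converted to a log hazard ratio.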
Table 2 - Covariates AA BB CC for 32 subjects that later either developed or did not develop Condition X

  Developed Condition X    Did NOT develop Condition X
  Subj  AA  BB  CC         Subj  AA  BB  CC
     1  69  83  51           13  36  55  39
     2  51  74  32           14  50  69  44
     3  27  68  33           15  36  59  28
     4  55  85  46           16  31  26  44
     5  27  99  34           17  31  49  47
     6  44  68  38           18  32  45  50
     7  49  88  57           19  40  59  33
     8  28  64  66           20  49  51  42
     9  32  58  46           21  38  70  47
    10  47  81  39           22  46  63  26
    11  35  77  31           23  46  64  47
    12  30  69  62           24  67  94  43
                             25  47  60  56
                             26  56  62  45
                             27  39  64  27
                             28  52  71  24
                             29  33  62  52
                             30  57  63  48
                             31  39  78  23
                             32  48  70  55
proc means n mean;
   class xx;
   var aa bb cc;
run;
where xx takes on the values 'X' or 'NotX'.
WARNING: Make sure that your regression is predicting Prob(X) given the
covariates and NOT Prob(NotX). (See the discussion of the descending option
in the comments in the file LGexamps.sas on the Math434 Web site.)
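A minimal sketch of how the descending option is typically used, assuming the regression in question is a logistic regression fit with PROC LOGISTIC. The data set name table2 is an assumption for illustration; the variables xx, aa, bb, and cc are as in the proc means step.

```sas
/* Sketch only: table2 is an assumed data set name */
proc logistic data=table2 descending;   /* reverse the sort order of the response levels */
   model xx = aa bb cc;                 /* with 'X' and 'NotX', this models Prob(xx='X') */
run;
```

By default PROC LOGISTIC models the probability of the response level that sorts first, which would be 'NotX' here; descending reverses that ordering. Check the response profile in the output, which states which level is being modeled.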