How to predict survival probabilities in R?

Posted 2019-02-15 04:57

Question:

I have a data set called veteran in R. I fitted a survival model and now want to predict survival probabilities from it. For example, what is the probability that a patient with karno = 80, diagtime = 10, age = 65, prior = 10 and trt = 2 lives longer than 100 days?

In this case the design matrix is x = (1,0,1,0,80,10,65,10,2)

Here is my code:

library(survival)
# Pass the data explicitly rather than attach()-ing it
weibull <- survreg(Surv(time, status) ~ celltype + karno + diagtime + age + prior + trt,
                   data = veteran, dist = "weibull")

and here is the output:

Any idea how to predict the survival probabilities?

Answer 1:

You can get predict.survreg to return predicted survival times for an individual case (whose covariate values you pass via newdata) at a range of quantiles:

 casedat <- list(celltype="smallcell", karno =80, diagtime=10, age= 65 , prior=10 , trt = 2)
 predict(weibull, newdata=casedat,  type="quantile", p=(1:98)/100)
 [1]   1.996036   3.815924   5.585873   7.330350   9.060716  10.783617
 [7]  12.503458  14.223414  15.945909  17.672884  19.405946  21.146470
[13]  22.895661  24.654597  26.424264  28.205575  29.999388  31.806521
[19]  33.627761  35.463874  37.315609  39.183706  41.068901  42.971927
[25]  44.893525  46.834438  48.795420  50.777240  52.780679  54.806537
[31]  56.855637  58.928822  61.026962  63.150956  65.301733  67.480255
[37]  69.687524  71.924578  74.192502  76.492423  78.825521  81.193029
[43]  83.596238  86.036503  88.515246  91.033959  93.594216  96.197674
[49]  98.846083 **101.541291** 104.285254 107.080043 109.927857 112.831032
[55] 115.792052 118.813566 121.898401 125.049578 128.270334 131.564138
[61] 134.934720 138.386096 141.922598 145.548909 149.270101 153.091684
[67] 157.019655 161.060555 165.221547 169.510488 173.936025 178.507710
[73] 183.236126 188.133044 193.211610 198.486566 203.974520 209.694281
[79] 215.667262 221.917991 228.474741 235.370342 242.643219 250.338740
[85] 258.511005 267.225246 276.561118 286.617303 297.518110 309.423232
[91] 322.542621 337.160149 353.673075 372.662027 395.025122 422.263020
[97] 457.180183 506.048094
# asterisks added to mark the 50th percentile

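You can then look for the first quantile that exceeds the specified time. Here p is the cumulative probability of having died by that time, so the survival probability is 1 - p. A time of 100 days falls between the 49th percentile (about 98.8 days) and the 50th percentile (about 101.5 days), so the probability of living longer than 100 days is roughly 0.5, just as one might expect from a homework question.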

# Plot cumulative probability (y) against predicted survival time (x)
png()
plot(x = predict(weibull, newdata = casedat, type = "quantile", p = (1:98)/100),
     y = (1:98)/100, type = "l")
dev.off()
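
Instead of reading the answer off the quantile table, the probability can also be computed directly. The following is only a sketch and not part of the original answer. It assumes survreg's documented Weibull parameterisation, in which the Weibull shape is 1/scale and the Weibull scale is exp(linear predictor). It uses the same patient as above, with "smallcell" as the cell type chosen in the answer. The names fit, newcase, lp and surv_100 are made up for illustration.

```r
library(survival)

# Refit the model from the question, passing the data explicitly
fit <- survreg(Surv(time, status) ~ celltype + karno + diagtime + age + prior + trt,
               data = veteran, dist = "weibull")

# The patient of interest (cell type as assumed in the answer above)
newcase <- data.frame(
  celltype = factor("smallcell", levels = levels(veteran$celltype)),
  karno = 80, diagtime = 10, age = 65, prior = 10, trt = 2
)

# Linear predictor on the log-time scale
lp <- predict(fit, newdata = newcase, type = "lp")

# survreg's Weibull corresponds to shape = 1 / fit$scale and scale = exp(lp);
# P(T > 100) is the upper tail of that Weibull distribution at t = 100
surv_100 <- pweibull(100, shape = 1 / fit$scale, scale = exp(lp),
                     lower.tail = FALSE)
surv_100
```

As a sanity check, asking predict() for the quantile at p = 1 - surv_100 with type = "quantile" should give back 100 days, and the value should be close to 0.5, in line with the quantile table above.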