Author: Johan Zetterqvist, 2015-03-04
Now fit a model to the localised melanoma data where the outcome is observed survival (i.e. all deaths are considered to be events).
You may have to install the required packages the first time you use them. You can install a package by install.packages("package_of_interest")
for each package you require.
require(foreign) # for reading data set from Stata
require(survival) # for Surv and survfit
Load the melanoma data and explore it.
diet <- data.frame(read.dta("http://biostat3.net/download/diet.dta"))
(a) Fit the following Poisson regression model to the diet data (we fitted this same model in question 6). Now fit the following Cox model.
## y is the observed time
## so it already measures time since entry
poisson13a <- glm( chd ~ hieng + offset( log( y ) ), family=poisson, data=diet)
summary(poisson13a)
##
## Call:
## glm(formula = chd ~ hieng + offset(log(y)), family = poisson,
## data = diet)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.7382 -0.6337 -0.4899 -0.3891 3.0161
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.2980 0.1890 -22.744 <2e-16 ***
## hienghigh -0.6532 0.3021 -2.162 0.0306 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 262.82 on 336 degrees of freedom
## Residual deviance: 258.00 on 335 degrees of freedom
## AIC: 354
##
## Number of Fisher Scoring iterations: 6
exp(cbind(coef(poisson13a), confint(poisson13a)))
## 2.5 % 97.5 %
## (Intercept) 0.01359599 0.009161455 0.01927158
## hienghigh 0.52035993 0.282917136 0.93283916
cox13a <- coxph(Surv(y, chd) ~ hieng, data=diet, ties="breslow")
summary(cox13a)
## Call:
## coxph(formula = Surv(y, chd) ~ hieng, data = diet, ties = "breslow")
##
## n= 337, number of events= 46
##
## coef exp(coef) se(coef) z Pr(>|z|)
## hienghigh -0.6475 0.5234 0.3022 -2.143 0.0321 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## hienghigh 0.5234 1.911 0.2895 0.9462
##
## Concordance= 0.581 (se = 0.038 )
## Rsquare= 0.014 (max possible= 0.781 )
## Likelihood ratio test= 4.73 on 1 df, p=0.02961
## Wald test = 4.59 on 1 df, p=0.03213
## Score (logrank) test = 4.75 on 1 df, p=0.02923
(b) Use the attained age as the timescale and refit the Cox model. Is the estimate of the effect of high energy different? Would we expect it to be different?
poisson13a <- glm( chd ~ hieng + offset( log( y ) ), family=poisson, data=diet)
summary(poisson13a)
##
## Call:
## glm(formula = chd ~ hieng + offset(log(y)), family = poisson,
## data = diet)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.7382 -0.6337 -0.4899 -0.3891 3.0161
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.2980 0.1890 -22.744 <2e-16 ***
## hienghigh -0.6532 0.3021 -2.162 0.0306 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 262.82 on 336 degrees of freedom
## Residual deviance: 258.00 on 335 degrees of freedom
## AIC: 354
##
## Number of Fisher Scoring iterations: 6
exp(cbind(coef(poisson13a), confint(poisson13a)))
## 2.5 % 97.5 %
## (Intercept) 0.01359599 0.009161455 0.01927158
## hienghigh 0.52035993 0.282917136 0.93283916
## Create two new variables for age
## age at study entry
diet$entry_age <- as.numeric(diet$doe - diet$dob) / 365.24
## age at study exit
diet$exit_age <- as.numeric(diet$dox - diet$dob) / 365.24
## Use the new age variables to provide
## counting process data to coxph
cox13b <- coxph(Surv(entry_age, exit_age, chd) ~ hieng, data=diet, ties="breslow")
summary(cox13b)
## Call:
## coxph(formula = Surv(entry_age, exit_age, chd) ~ hieng, data = diet,
## ties = "breslow")
##
## n= 337, number of events= 46
##
## coef exp(coef) se(coef) z Pr(>|z|)
## hienghigh -0.6113 0.5426 0.3028 -2.019 0.0435 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## hienghigh 0.5426 1.843 0.2998 0.9823
##
## Concordance= 0.584 (se = 0.038 )
## Rsquare= 0.012 (max possible= 0.755 )
## Likelihood ratio test= 4.2 on 1 df, p=0.04049
## Wald test = 4.08 on 1 df, p=0.04349
## Score (logrank) test = 4.2 on 1 df, p=0.04035