Survival analysis for telecom churn using R
I am working on a telecom churn problem, and here is my dataset:
http://www.sgi.com/tech/mlc/db/churn.data
Names - http://www.sgi.com/tech/mlc/db/churn.names
I am new to survival analysis. Given the training data, my idea is to build a survival model that estimates survival time, along with churn / non-churn prediction on the test data, based on the independent variables. Can anyone help me with some code or pointers on how to solve this problem?
To be precise, let's say my training data contains
customer usage data, details of the plan, its duration, etc., and whether the customer churned or not.
Using general classification models, I can predict churn or no churn on the test data. Now, using survival analysis, I want to predict the survival time on the test data.
Thanks, Maddy
If you're still interested (or for the benefit of those who come later), I've written a few tutorials specifically on survival analysis of customer churn data using R. They cover a range of analytical methods and include sample data and R code.
Basic Survival Analysis: http://daynebatten.com/2015/02/customer-churn-survival-analysis/
Basic Cox Regression: http://daynebatten.com/2015/02/customer-churn-cox-regression/
Time-varying covariates in Cox regression: http://daynebatten.com/2015/12/survival-analysis-customer-churn-time-varying-covariates/
Time-dependent coefficients in Cox regression: http://daynebatten.com/2016/01/customer-churn-time-dependent-coefficients/
Restricted mean survival time (quantify the impact of churn in dollar terms): http://daynebatten.com/2015/03/customer-churn-restricted-mean-survival-time/
Pseudo Observations (quantifies dollar gains / losses associated with effects of variables): http://daynebatten.com/2015/03/customer-churn-pseudo-observations/
Please forgive the silly images.
Here's some code to get you started:
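First, read the data:
# Column names: keep the text before the ":" on each line of churn.names, skipping the 4 header lines
nm <- read.csv("http://www.sgi.com/tech/mlc/db/churn.names",
               skip=4, colClasses=c("character", "NULL"), header=FALSE, sep=":")[[1]]
# The data file has no header row; the last column is the churn label
dat <- read.csv("http://www.sgi.com/tech/mlc/db/churn.data", header=FALSE, col.names=c(nm, "Churn"))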
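Use Surv() to set up a survival object for modeling:
library(survival)
# account.length is the time variable and churn is the event.
# Churn is read as text (a factor before R 4.0), assumed here to contain "True" or "False",
# so test it explicitly: as.numeric() only works when Churn is a factor.
s <- with(dat, Surv(account.length, grepl("True", Churn)))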
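To make it clearer what Surv() produces, here is a minimal sketch on a tiny made-up data frame (the `toy` data and the `churned` column are hypothetical, not part of the churn dataset). Customers who have not churned yet are treated as censored: we only know they survived at least as long as their account length.

```r
library(survival)

# Hypothetical toy data: four customers, two of whom churned
toy <- data.frame(
  account.length = c(12, 30, 45, 60),
  churned        = c(TRUE, FALSE, TRUE, FALSE)
)

# TRUE marks an observed event (churn); FALSE marks a censored observation
s_toy <- with(toy, Surv(account.length, churned))

# Censored observations print with a trailing "+"
print(s_toy)
```

The same object can go on the left-hand side of a model formula, which is what the coxph() calls in this answer do.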
Fit a Cox proportional hazards model and plot the result:
# dat[, -4] drops the 4th column (the phone number), which is not a useful predictor
model <- coxph(s ~ total.day.charge + number.customer.service.calls, data=dat[, -4])
summary(model)
plot(survfit(model))
Add a stratification variable:
model <- coxph(s ~ total.day.charge + strata(number.customer.service.calls <= 3), data=dat[, -4])
summary(model)
plot(survfit(model), col=c("blue", "red"))
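As for predicting survival time on test data: a Cox model gives you a predicted survival curve per customer rather than a single number, so you have to pick a summary such as the median. Below is a rough sketch of one way to do that. It uses simulated data so it runs on its own; the data-generating numbers, the `churned` column and the train/test split are all assumptions for illustration, not taken from the churn dataset.

```r
library(survival)

set.seed(1)
n <- 500

# Simulated stand-in for the churn data (hypothetical effects and scales)
sim <- data.frame(
  total.day.charge              = runif(n, 10, 60),
  number.customer.service.calls = rpois(n, 1.5)
)
rate <- 0.005 * exp(0.02 * sim$total.day.charge +
                    0.3  * sim$number.customer.service.calls)
event_time  <- rexp(n, rate)        # time until churn
censor_time <- runif(n, 50, 250)    # time until observation stops
sim$account.length <- pmin(event_time, censor_time)
sim$churned        <- event_time <= censor_time

train <- sim[1:400, ]
test  <- sim[401:500, ]

fit <- coxph(Surv(account.length, churned) ~
               total.day.charge + number.customer.service.calls,
             data = train)

# One predicted survival curve per test customer:
# curves$surv is a matrix with one row per time point and one column per customer
curves <- survfit(fit, newdata = test)

# Median predicted survival time: first time the curve reaches 0.5 or below
# (NA if the curve never gets that low within the observed time range)
median_time <- apply(curves$surv, 2, function(sv) curves$time[which(sv <= 0.5)[1]])

# Relative risk score per customer (higher means churn is expected sooner)
risk <- predict(fit, newdata = test, type = "risk")

head(data.frame(median_time = median_time, risk = risk))
```

With the real data you would fit on your training rows and pass the test rows as `newdata`. Keep in mind that the predicted curves only extend as far as the longest account length seen in training, so a median may be undefined for low-risk customers.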