Survival analysis for telecom churn using R

I am working on a telecom churn problem, and here is my dataset:

http://www.sgi.com/tech/mlc/db/churn.data

Names - http://www.sgi.com/tech/mlc/db/churn.names

I am new to survival analysis. Given the training data, my idea is to build a survival model that estimates survival time, along with churn / attrition prediction, on the test data based on the independent variables. Can anyone help me with some code or pointers on how to approach this problem?

To be precise, let's say my training data contains customer usage data, plan details, account duration, etc., and whether the customer churned or not.

Using general classification models, I can predict churn or no churn on the test data. Now, using survival analysis, I want to predict the survival time for the test data.

Thanks, Maddy



2 answers


If you're still interested (or for the benefit of those who come later), I've written a few tutorials specifically on survival analysis of customer churn data using R. They cover several analytical methods and include sample data and R code.

Basic Survival Analysis: http://daynebatten.com/2015/02/customer-churn-survival-analysis/

Basic Cox Regression: http://daynebatten.com/2015/02/customer-churn-cox-regression/

Time-varying covariates in Cox regression: http://daynebatten.com/2015/12/survival-analysis-customer-churn-time-varying-covariates/

Time-dependent coefficients in Cox regression: http://daynebatten.com/2016/01/customer-churn-time-dependent-coefficients/

Restricted mean survival time (quantifies the impact of churn in dollar terms): http://daynebatten.com/2015/03/customer-churn-restricted-mean-survival-time/

Pseudo-observations (quantify the dollar gains / losses associated with the effects of variables): http://daynebatten.com/2015/03/customer-churn-pseudo-observations/

Please forgive the goofy images.



Here's some code to get you started:

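Read the data first:

# Column names: skip the 4 header lines of churn.names and keep the text before each ":"
nm <- read.csv("http://www.sgi.com/tech/mlc/db/churn.names", 
               skip=4, colClasses=c("character", "NULL"), header=FALSE, sep=":")[[1]]
# The data file has no header row; the last column is the churn label
dat <- read.csv("http://www.sgi.com/tech/mlc/db/churn.data", header=FALSE, col.names=c(nm, "Churn"))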

      

Use Surv() to set up a survival object for modeling:

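library(survival)

# The event indicator should be TRUE for churned accounts. The labels are assumed
# to look like " True." / " False."; as.numeric() on them returns NA when the
# column is read as character (the default since R 4.0).
s <- with(dat, Surv(account.length, grepl("True", Churn)))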
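To see what the survival object holds before fitting anything, here is a minimal sketch on a hypothetical eight-row data set (the values are invented for illustration, and the " True." / " False." label format is an assumption about how the churn column is coded). It builds the Surv object and a Kaplan-Meier estimate with no covariates:

```r
library(survival)

# Hypothetical mini data set: tenure plus a churn label stored as text
toy <- data.frame(
  account.length = c(12, 30, 45, 45, 60, 75, 90, 120),
  Churn = c(" True.", " False.", " True.", " False.",
            " True.", " False.", " True.", " False.")
)

# TRUE = churn observed (event), FALSE = still active (censored)
churned <- grepl("True", toy$Churn)

s_toy <- Surv(toy$account.length, churned)
print(s_toy)  # censored times are printed with a trailing "+"

# Kaplan-Meier estimate of the overall survival curve
km <- survfit(s_toy ~ 1)
summary(km)
```

Customers who have not churned yet are treated as censored rather than dropped, which is the main difference from a plain classification setup.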

      

Set up a Cox proportional hazards model and plot the result



model <- coxph(s ~ total.day.charge + number.customer.service.calls, data=dat[, -4])
summary(model)
plot(survfit(model))

      


Add a stratification term:

model <- coxph(s ~ total.day.charge + strata(number.customer.service.calls <= 3), data=dat[, -4])
summary(model)
plot(survfit(model), col=c("blue", "red"))
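For the original question of predicting survival time on test data, one possible approach is to pass the test rows to survfit() via newdata and read a median survival time off each predicted curve. The sketch below is only an illustration: it uses simulated data with the same column names (not the real churn file), and the train/test split, effect sizes, and the "churned" column are all hypothetical.

```r
library(survival)

# Simulated stand-in for the churn data (hypothetical values, not the real file)
set.seed(42)
n <- 500
toy <- data.frame(
  total.day.charge = runif(n, 10, 60),
  number.customer.service.calls = rpois(n, 1.5)
)
# Assumed effect: higher charges and more service calls shorten tenure
rate <- 0.005 * exp(0.03 * toy$total.day.charge +
                    0.3 * toy$number.customer.service.calls)
time_to_churn <- rexp(n, rate)
censor_time <- runif(n, 50, 250)
toy$account.length <- pmin(time_to_churn, censor_time)
toy$churned <- time_to_churn <= censor_time

train <- toy[1:400, ]
test <- toy[401:500, ]

fit <- coxph(Surv(account.length, churned) ~ total.day.charge +
               number.customer.service.calls, data = train)

# One predicted survival curve per test row (columns of curves$surv)
curves <- survfit(fit, newdata = test)

# Median survival time: first time the curve drops to 0.5 or below
# (NA when the curve never gets that low within the observed follow-up)
median_time <- apply(curves$surv, 2, function(sv) {
  below <- which(sv <= 0.5)
  if (length(below) == 0) NA_real_ else curves$time[min(below)]
})

# Relative risk score for each test customer (higher = churns sooner)
risk <- predict(fit, newdata = test, type = "risk")

head(data.frame(median_time, risk))
```

A Cox model does not give a single predicted time directly, so the median of each predicted curve is used here as a summary. A restricted mean or the full curve may be more appropriate depending on how the prediction will be used.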

      

