Add a reference curve with AUC 0.8 to a ROC plot
I have a simple ROC graph that I create using the pROC package:
plot.roc(response, predictor)
It works as expected, but I would like to add an "ideal" reference curve with an AUC of 0.8 for comparison (the AUC of my ROC plot is 0.66).
Any thoughts?
Just to clarify, I am not trying to smooth my ROC curve; I am trying to add a reference curve that represents an AUC of 0.8 (similar to the diagonal reference line that represents an AUC of 0.5).
The diagonal reference line has a clear meaning (a model that guesses at random), so you should likewise define a model to associate with the 0.8 AUC reference curve. Different models will give different reference curves.
For example, you can define a model in which the predicted probabilities are distributed evenly between 0 and 1 and, for a point with predicted probability p, the probability of a true outcome is p^(1/k) for some constant k. For this model, k = 2 (a true-outcome probability of sqrt(p)) gives a plot with an AUC of about 0.8.
library(pROC)
set.seed(144)
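# Predicted probabilities spread evenly over [0, 1]
probs <- seq(0, 1, length.out=10000)
# P(runif^2 < p) = sqrt(p), so each outcome is TRUE with probability p^(1/2)
truth <- runif(10000)^2 < probs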
plot.roc(truth, probs)
# Call:
# plot.roc.default(x = truth, predictor = probs)
#
# Data: probs in 3326 controls (truth FALSE) < 6674 cases (truth TRUE).
# Area under the curve: 0.7977
Some algebra shows that this family of models has an AUC of (2 + 3k) / (2 + 4k) (for k = 2 that is 8/10 = 0.8), which means it can generate curves with an AUC between 0.75 and 1 depending on the value of k.
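The closed form above can be inverted to pick k for a target AUC, and the reference curve can be drawn exactly instead of simulated. The following is a minimal base R sketch under the model assumptions stated above (p uniform on [0, 1], true-outcome probability p^(1/k)); the function names `k_for_auc`, `reference_roc` and `trapezoid_auc` are illustrative helpers and not part of pROC.

```r
# Invert AUC = (2 + 3k) / (2 + 4k) for k; only meaningful for 0.75 < auc < 1
k_for_auc <- function(auc) 2 * (1 - auc) / (4 * auc - 3)

# Exact sensitivity and specificity at each threshold t on the predicted probability,
# for p ~ Uniform(0, 1) and P(TRUE | p) = p^a with a = 1/k
reference_roc <- function(k, t = seq(0, 1, length.out = 1001)) {
  a <- 1 / k
  sens <- 1 - t^(a + 1)                   # P(p > t | outcome TRUE)
  spec <- ((a + 1) * t - t^(a + 1)) / a   # P(p <= t | outcome FALSE)
  data.frame(threshold = t, sensitivity = sens, specificity = spec)
}

# Numerical check of the area under the curve (trapezoid rule over specificity)
trapezoid_auc <- function(roc) {
  with(roc, sum(diff(specificity) *
                (head(sensitivity, -1) + tail(sensitivity, -1)) / 2))
}

k <- k_for_auc(0.8)       # equals 2 up to floating-point error
ref <- reference_roc(k)
trapezoid_auc(ref)        # should be close to the 0.8 target

# To overlay on an existing pROC plot (assuming the default axes, where x is specificity):
# plot.roc(response, predictor)
# lines(ref$specificity, ref$sensitivity, lty = 2, col = "blue")
```

Drawing the curve from the formula avoids the sampling noise of the simulated version (0.7977 rather than 0.8 in the run above).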
Another approach is related to logistic regression. Suppose you have a logistic regression linear predictor value p (that is, you would predict the probability 1 / (1 + exp(-p))). You could then label the outcome as true if p plus some normally distributed noise is greater than 0, and as false otherwise. If the noise has a variance of 0, your model will have an AUC of 1, and as the variance of the noise approaches infinity, the AUC approaches 0.5.
If I assume the linear predictor values come from a standard normal distribution, it looks like normally distributed noise with a standard deviation of 1.2 gives an AUC of about 0.8 (although I couldn't find a nice closed form for the AUC):
set.seed(144)
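# Linear predictor values drawn from a standard normal distribution
pred.fxn <- rnorm(10000)
# Outcome is TRUE when the predictor plus N(0, sd = 1.2) noise is non-negative
truth <- (pred.fxn + rnorm(10000, 0, 1.2)) >= 0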
plot.roc(truth, pred.fxn)
# Call:
# plot.roc.default(x = truth, predictor = pred.fxn)
#
# Data: pred.fxn in 5025 controls (truth FALSE) < 4975 cases (truth TRUE).
# Area under the curve: 0.7987
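Since there is no closed form here, the noise standard deviation for a given target AUC can be searched for numerically. The sketch below is one possible way to do it in base R, assuming the same simulation setup as above; `auc_rank` and `auc_for_sd` are hypothetical helper names, and the AUC is computed with the rank-sum (Mann-Whitney) identity rather than with pROC.

```r
# AUC via the rank-sum (Mann-Whitney) identity; truth is logical, score is numeric
auc_rank <- function(truth, score) {
  r <- rank(score)
  n1 <- as.numeric(sum(truth))
  n0 <- as.numeric(sum(!truth))
  (sum(r[truth]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

set.seed(144)
n <- 10000
pred.fxn <- rnorm(n)   # linear predictor values
z <- rnorm(n)          # standard normal noise, rescaled by the candidate sd

# Simulated AUC as a function of the noise sd; reusing the same draws
# keeps this (nearly) monotone in s, so a root search is well behaved
auc_for_sd <- function(s) auc_rank((pred.fxn + s * z) >= 0, pred.fxn)

target <- 0.8
fit <- uniroot(function(s) auc_for_sd(s) - target, interval = c(0.1, 10))
fit$root               # noise sd whose simulated AUC is closest to the target
auc_for_sd(fit$root)
```

The result depends on the random draws, so it will only approximate the sd of about 1.2 found by hand above.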
A quick and crude way is to add a circle of radius 1 to your plot; the quarter circle that falls inside the plot has an AUC of pi / 4 = 0.7853982, which is close to (but not exactly) 0.8.
library(pROC)
library(car)
n <- 100L
x1 <- rnorm(n, 2.0, 0.5)
x2 <- rnorm(n, -1.0, 2)
y <- rbinom(n, 1L, plogis(-0.4 + 0.5 * x1 + 0.1 * x2))
mod <- glm(y ~ x1 + x2, "binomial")
probs <- predict(mod, type = "response")
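# Plot the fitted model's ROC curve (the x axis is specificity, running from 1 to 0)
plot(roc(y, probs))
# Unit circle centered at specificity 0, sensitivity 0; the arc inside the plot is the reference curve
ellipse(c(0, 0), matrix(c(1,0,0,1), 2, 2), radius = 1, center.pch = FALSE, col = "blue")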
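The same quarter circle can also be drawn without the car package. This is a small base R sketch, assuming the default pROC axes where x is specificity; the variable names are illustrative, and the trapezoid sum is only there to confirm the pi / 4 area.

```r
# Points on the quarter circle specificity^2 + sensitivity^2 = 1
theta <- seq(0, pi / 2, length.out = 501)
spec <- cos(theta)   # runs from 1 down to 0
sens <- sin(theta)   # runs from 0 up to 1

# Area under the arc: integrate sensitivity against 1 - specificity (trapezoid rule)
fpr <- 1 - spec
area <- sum(diff(fpr) * (head(sens, -1) + tail(sens, -1)) / 2)
area                 # should be close to pi / 4

# To overlay on an existing pROC plot:
# plot(roc(y, probs))
# lines(spec, sens, col = "blue")
```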