Add a reference curve with an AUC of 0.8 to a ROC plot

Question

I have a simple ROC graph that I create using the pROC package:

plot.roc(response, predictor)

It works as expected, but I would like to add an "ideal" reference curve with an AUC of 0.8 for comparison (the AUC of my ROC plot is 0.66).

Any thoughts?

Just to clarify, I am not trying to smooth my ROC curve; I am trying to add a reference curve that represents an AUC of 0.8 (similar to the diagonal reference line, which represents an AUC of 0.5).

+3

r auc roc

Oposum Apr 16 15 at 23:27

2 answers

A quick and crude way is to add a circle of radius 1 to your plot; the quarter circle inside the plotting region has an AUC of pi / 4 = 0.7853982:

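library(pROC)
library(car)  # provides ellipse()

# Simulate two predictors and a binary outcome
n <- 100L

x1 <- rnorm(n, 2.0, 0.5)
x2 <- rnorm(n, -1.0, 2)
y <- rbinom(n, 1L, plogis(-0.4 + 0.5 * x1 + 0.1 * x2))

# Fit a logistic regression and extract the predicted probabilities
mod <- glm(y ~ x1 + x2, family = "binomial")
probs <- predict(mod, type = "response")

# Plot the ROC curve, then overlay a unit circle centered at (0, 0)
plot(roc(y, probs))
ellipse(c(0, 0), matrix(c(1,0,0,1), 2, 2), radius = 1, center.pch = FALSE, col = "blue")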
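
The unit circle gives roughly 0.785 rather than exactly 0.8. As a sketch of one way to hit the target exactly (this goes beyond the answer above), the circle can be generalised to a superellipse spec^n + sens^n = 1, whose first-quadrant area is gamma(1 + 1/n)^2 / gamma(1 + 2/n); n = 2 recovers the circle. The names `superellipse_auc` and `n_exp` are made up for illustration, and only base R is used.

```r
# Area under the first-quadrant arc of spec^n + sens^n = 1
superellipse_auc <- function(n) gamma(1 + 1 / n)^2 / gamma(1 + 2 / n)

# Solve for the exponent that gives an area of exactly 0.8
target <- 0.8
n_exp <- uniroot(function(n) superellipse_auc(n) - target,
                 interval = c(1, 10), tol = 1e-10)$root

# Points on the reference curve, in pROC's default coordinates
# (x = specificity, y = sensitivity)
spec <- seq(0, 1, length.out = 201)
sens <- (1 - spec^n_exp)^(1 / n_exp)

# After plot(roc(y, probs)), the curve could be overlaid with:
# lines(spec, sens, col = "blue", lty = 2)
```

Like the circle, this curve is purely geometric: it has the right area but does not correspond to any particular model.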

+2

Jeff Apr 17 15 at 18:03

josliber · Accepted Answer · 2015-04-17T01:37:20+0000

The diagonal reference line corresponds to a specific model (one that guesses at random), so you likewise need to define a model associated with the AUC 0.8 reference curve. Different models will give different reference curves.

For example, you can define a model in which the predicted probabilities are distributed uniformly between 0 and 1 and, for a point with predicted probability p, the probability of a true outcome is p^(1/k) for some constant k. For this model, k = 2 gives a plot with an AUC of 0.8.

library(pROC)
set.seed(144)
probs <- seq(0, 1, length.out=10000)
truth <- runif(10000)^2 < probs
plot.roc(truth, probs)
# Call:
# plot.roc.default(x = truth, predictor = probs)
# 
# Data: probs in 3326 controls (truth FALSE) < 6674 cases (truth TRUE).
# Area under the curve: 0.7977

Some algebra shows that this particular family of models has an AUC of (2 + 3k) / (2 + 4k), which means it can generate curves with an AUC between 0.75 and 1, depending on the value of k.
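
The simulation above draws `truth <- runif(n)^k < probs`, so P(true | p) = p^(1/k). Under that assumption the ROC curve can also be written down exactly instead of simulated, which gives a smooth line to overlay. The sketch below uses only base R; `ref_roc` is a hypothetical helper name, and the formulas come from integrating the case and control score densities up to each threshold.

```r
# Analytic ROC curve for scores p ~ Uniform(0, 1) with P(true | p) = p^(1/k),
# where k is chosen so that the AUC equals the requested value.
ref_roc <- function(auc, n = 201) {
  stopifnot(auc > 0.75, auc < 1)        # range this model family can reach
  k <- (2 - 2 * auc) / (4 * auc - 3)    # inverts AUC = (2 + 3k) / (2 + 4k)
  a <- 1 / k
  t <- seq(0, 1, length.out = n)        # decision thresholds on the score
  data.frame(
    specificity = ((a + 1) * t - t^(a + 1)) / a,  # P(p <= t | control)
    sensitivity = 1 - t^(a + 1)                   # P(p >  t | case)
  )
}

ref <- ref_roc(0.8)

# After plot.roc(response, predictor), with pROC's default axes
# (x = specificity), the reference curve could be added with:
# lines(ref$specificity, ref$sensitivity, col = "blue", lty = 2)
```

Because the curve is computed rather than sampled, it has no simulation noise, but it is limited to target AUCs strictly between 0.75 and 1.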

Another approach you can use is related to logistic regression. Suppose a logistic regression gives a linear predictor value p for an observation, i.e. it predicts the probability 1 / (1 + exp(-p)). You could then label the true outcome as true if p plus some normally distributed noise is greater than 0, and as false otherwise. If the noise has a variance of 0, your model will have an AUC of 1, and as the variance of the noise approaches infinity, the AUC approaches 0.5.

If I assume the original predictions come from a standard normal distribution, it looks like normally distributed noise with a standard deviation of 1.2 gives an AUC of 0.8 (although I couldn't find a nice closed form for the AUC):

set.seed(144)
pred.fxn <- rnorm(10000)
truth <- (pred.fxn + rnorm(10000, 0, 1.2)) >= 0
plot.roc(truth, pred.fxn)
# Call:
# plot.roc.default(x = truth, predictor = pred.fxn)
# 
# Data: pred.fxn in 5025 controls (truth FALSE) < 4975 cases (truth TRUE).
# Area under the curve: 0.7987
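
Without a closed form to hand, the noise standard deviation for a given target AUC can be searched for numerically. A rough sketch in base R, assuming the same setup as above (standard normal scores, outcome true when score plus noise is non-negative): the hypothetical helper `auc_for_sd` estimates the AUC by simulation with the rank (Mann-Whitney) formula, and `uniroot` looks for the standard deviation that gives 0.8. The sample size, seed, and search interval are arbitrary choices.

```r
# Simulated AUC of the noisy-threshold model for a given noise sd.
# The seed is fixed so the function is deterministic in noise_sd.
auc_for_sd <- function(noise_sd, n = 1e5, seed = 144) {
  set.seed(seed)
  score <- rnorm(n)
  truth <- (score + rnorm(n, 0, noise_sd)) >= 0
  n1 <- as.numeric(sum(truth))   # number of cases
  n0 <- n - n1                   # number of controls
  # Mann-Whitney estimate: P(score of a case > score of a control)
  (sum(rank(score)[truth]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Find the noise sd whose simulated AUC is 0.8
fit <- uniroot(function(s) auc_for_sd(s) - 0.8, interval = c(0.5, 3))
fit$root
```

The result should land close to the 1.2 used above; the same search works for other target AUCs between 0.5 and 1 if the interval is widened accordingly.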
