Custom linear model in geom_smooth

While looking at this issue, I found that I am unable to specify a custom linear model for geom_smooth. My code looks like this:

library(ggplot2)

example.label <- c("A","A","A","A","A","B","B","B","B","B")
example.value <- c(5, 4, 4, 5, 3, 8, 9, 11, 10, 9)
example.age <- c(30, 40, 50, 60, 70, 30, 40, 50, 60, 70)
example.score <- c(90,95,89,91,85,83,88,94,83,90)
example.data <- data.frame(example.label, example.value,example.age,example.score)

p = ggplot(example.data, aes(x=example.age,
                         y=example.value,color=example.label)) +
  geom_point()
  #geom_smooth(method = lm)

cf = function(dt){
  lm(example.value ~example.age+example.score, data = dt)
}

cf(example.data)

p_smooth <- by(example.data, example.data$example.label, 
               function(x) geom_smooth(data=x, method = lm, formula = cf(x)))

p + p_smooth 

      

I am getting this error / warning:

Warning messages:
1: Computation failed in `stat_smooth()`:
object 'weight' not found 
2: Computation failed in `stat_smooth()`:
object 'weight' not found 

      

Why am I getting this, and what is the correct way to pass a custom model to geom_smooth? Thanks.


1 answer


A regression model with two continuous predictors and a continuous outcome lives in three-dimensional space (two dimensions for the predictors, one for the outcome), whereas a ggplot graph is two-dimensional (one continuous predictor on the x-axis and the outcome on the y-axis). This is the main reason you cannot plot a function of two continuous predictor variables with geom_smooth.
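For comparison, here is a minimal sketch of what geom_smooth can do with a custom model: its formula argument takes a formula written in terms of the aesthetics x and y, not a fitted model object, so any model in a single predictor can be passed that way. The shortened column names (label, value, age) and the quadratic term are assumptions made for this illustration and are not part of the original question.

```r
library(ggplot2)

# The question's data, with column names shortened for this sketch
example.data <- data.frame(
  label = rep(c("A", "B"), each = 5),
  value = c(5, 4, 4, 5, 3, 8, 9, 11, 10, 9),
  age   = rep(c(30, 40, 50, 60, 70), times = 2)
)

# The formula refers to the aesthetics x and y, not to column names.
# A quadratic in x is used here only as an example of a custom model.
p <- ggplot(example.data, aes(x = age, y = value, colour = label)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2))
```

Printing p draws one fitted curve per label, because the colour aesthetic splits the data into groups before the model is fitted.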

One "workaround" is to select a few specific values for one of the continuous predictors and then, for each of those values, plot a line of the predicted outcome against the other predictor on the x-axis.

Here's an example with the mtcars data frame. The regression model below predicts mpg using wt and hp. We then plot predictions of mpg vs. wt for different values of hp: we create a data frame of predictions and plot it using geom_line. Each line in the graph represents the regression prediction of mpg vs. wt for one value of hp. Of course, you can also reverse the roles of wt and hp.

library(ggplot2)
theme_set(theme_classic())

d = mtcars
m2 = lm(mpg ~ wt + hp, data=d)

pred.data = expand.grid(wt = seq(min(d$wt), max(d$wt), length.out=20),
                        hp = quantile(d$hp))
pred.data$mpg = predict(m2, newdata=pred.data)

ggplot(pred.data, aes(wt, mpg, colour=factor(hp))) +
  geom_line() +
  labs(colour="HP Quantiles")
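The same idea can be carried over to the data from the question. The following is only a sketch under some assumptions that are not in the original post: the columns are renamed to label, value, age and score, the model is fitted separately for each label, and score is held at its mean within each label so that the prediction can be drawn as a line against age.

```r
library(ggplot2)

# The question's data, with column names shortened for this sketch
example.data <- data.frame(
  label = rep(c("A", "B"), each = 5),
  value = c(5, 4, 4, 5, 3, 8, 9, 11, 10, 9),
  age   = rep(c(30, 40, 50, 60, 70), times = 2),
  score = c(90, 95, 89, 91, 85, 83, 88, 94, 83, 90)
)

# Fit value ~ age + score separately for each label, then predict value
# over a grid of ages with score fixed at that label's mean (an assumption).
pred.data <- do.call(rbind, lapply(split(example.data, example.data$label), function(d) {
  m <- lm(value ~ age + score, data = d)
  grid <- data.frame(label = d$label[1],
                     age   = seq(min(d$age), max(d$age), length.out = 20),
                     score = mean(d$score))
  grid$value <- predict(m, newdata = grid)
  grid
}))

# Raw points plus one prediction line per label
p <- ggplot(example.data, aes(age, value, colour = label)) +
  geom_point() +
  geom_line(data = pred.data)
```

Fixing score at other values (for example its quartiles) would give several lines per label, in the same way as the hp quantiles above.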

      




Another option is to use a color gradient to represent mpg (the outcome) and plot wt and hp on the x and y axes:

pred.data = expand.grid(wt = seq(min(d$wt), max(d$wt), length.out=100),
                        hp = seq(min(d$hp), max(d$hp), length.out=100))
pred.data$mpg = predict(m2, newdata=pred.data)

ggplot(pred.data, aes(wt, hp, z=mpg, fill=mpg)) +
  geom_tile() +
  scale_fill_gradient2(low="red", mid="yellow", high="blue", midpoint=median(pred.data$mpg)) 

      

