scale_y_log10() and coord_trans(ytrans = 'log10') lead to different results

I am using log transforms for my statistical analyses (reaction times), and now I want to plot my data on a log y-axis. Using coord_trans(ytrans = "log10") gives me correct results, but I need bars instead of points for my chart. scale_y_log10() works with bars, but it displays the wrong values (bar1 has a mean of 833 but is shown above 900; bar2 has a mean of 568 but is shown closer to 500).

library(ggplot2)  # mean_cl_normal additionally requires the Hmisc package

set.seed(10)

bar1 <- abs(rnorm(n = 232, mean = 833, sd = 1103)) + 1
bar2 <- abs(rnorm(n = 393, mean = 568, sd = 418)) + 1

graph_data <- data.frame(RT = c(bar1, bar2), group = c(rep(1, 232), rep(2, 393)))

# Version 1: points with coord_trans()
ggplot(graph_data, aes(group, RT)) +
  stat_summary(fun.y = mean, geom = 'point', position = 'dodge') +
  stat_summary(fun.data = mean_cl_normal, geom = 'pointrange', position = position_dodge(width = .9)) +
  coord_trans(ytrans = "log10")

# Version 2: bars with scale_y_log10()
ggplot(graph_data, aes(group, RT)) +
  stat_summary(fun.y = mean, geom = 'bar', position = 'dodge') +
  stat_summary(fun.data = mean_cl_normal, geom = 'pointrange', position = position_dodge(width = .9)) +
  scale_y_log10(breaks = seq(300, 1000, 100))

      

Thanks for the help!

1 answer


There are two reasons why you got different values.

First, if you look at the help page for coord_trans(), you will see that:

coord_trans is different to scale transformations in that it occurs after statistical transformation and will affect the visual appearance of geoms - there is no guarantee that straight lines will continue to be straight.



This means that with coord_trans() only the coordinates (the y-axis) are log10-transformed, after the statistics have been computed, whereas with scale_y_log10() your actual data are log-transformed before any other calculations are made.
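The difference in the order of operations can be checked without any plotting. The sketch below uses base R only and reuses the simulated bar1 values from the question; the names arith_mean and geo_mean are introduced here purely for illustration. It compares the mean of the raw data (what is summarised when the transformation is applied to the coordinates) with the back-transformed mean of the logged data (what is summarised when the transformation is applied to the scale).

```r
# Same simulated reaction times as the first group in the question.
set.seed(10)
bar1 <- abs(rnorm(n = 232, mean = 833, sd = 1103)) + 1

# Coordinate transformation: the mean is computed on the raw data,
# and only its position on the axis is log-scaled afterwards.
arith_mean <- mean(bar1)

# Scale transformation: the data are log10-transformed first, the mean is
# taken on the log scale, and the axis labels back-transform the result.
geo_mean <- 10^mean(log10(bar1))

print(c(arithmetic = arith_mean, geometric = geo_mean))
```

The second quantity is the geometric mean, which is never larger than the arithmetic mean, so the two approaches would not agree even if no rows were dropped.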

Second, your data has negative values. When you apply scale_y_log10(), those values are removed (the log of a negative number is NaN), so all calculations are done on only part of your data, and the mean you get is greater than the one you get with coord_trans().

Warning messages:
1: In scale$trans$trans(x) : NaNs produced
2: In scale$trans$trans(x) : NaNs produced
3: Removed 100 rows containing missing values (stat_summary). 
4: Removed 100 rows containing missing values (stat_summary). 
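The effect behind these warnings can be reproduced in base R. This is a minimal sketch under an assumption: the data here are hypothetical, generated like the question's first group but without abs(), so that some values are negative. The names rt, logged, kept and n_dropped are illustrative only.

```r
# Hypothetical data: like the question's first group, but without abs(),
# so some simulated values are negative.
set.seed(10)
rt <- rnorm(n = 232, mean = 833, sd = 1103)

# log10() of a negative number is NaN (R normally warns "NaNs produced").
logged <- suppressWarnings(log10(rt))
n_dropped <- sum(is.nan(logged))

# Rows with NaN are removed before summarising, so only positive values remain.
kept <- rt[!is.nan(logged)]

print(n_dropped)
print(c(mean_all = mean(rt), mean_kept = mean(kept)))
```

Because only negative values are discarded, the mean of the remaining values is necessarily higher than the mean of the full data.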

      
