Parameterized ggplot2 function for histogram / density cannot find object
I have created a histogram / density plot function where I want the y-axis to be count and not density, but I am having trouble parameterizing its bin width.
I am using examples based on http://docs.ggplot2.org/current/geom_histogram.html to illustrate my attempts.
Here's the working plotMovies1 function. I followed the mailing-list post linked in the code comment to put ..count.. instead of ..density.. on the y-axis. Note that it hardcodes the .5 bin width in two places, which is what I want to parameterize ...
# I want y axis as count, rather than density, and followed
# https://stat.ethz.ch/pipermail/r-help/2011-June/280588.html
plotMovies1 <- function() {
  m <- ggplot(movies, aes(x = rating))
  m <- m + geom_histogram(binwidth = .5)
  m <- m + geom_density(aes(y = .5 * ..count..))
}
My first, naive attempt to parameterize the bin width with a local variable bw, in plotMovies2, fails ...
# Failed first attempt to parameterize binwidth
plotMovies2 <- function() {
  bw <- .5
  m <- ggplot(movies, aes(x = rating))
  m <- m + geom_histogram(binwidth = bw)
  # Error in eval(expr, envir, enclos) : object 'bw' not found
  m <- m + geom_density(aes(y = bw * ..count..))
}
> print(plotMovies2())
Error in eval(expr, envir, enclos) : object 'bw' not found
I found a discussion of passing the local environment to aes in ggplot at https://github.com/hadley/ggplot2/issues/743 , but plotMovies3 fails in the same way: the bw object is still not found ...
# Failed second attempt to parameterize binwidth, even after establishing
# aes environment, per https://github.com/hadley/ggplot2/issues/743
plotMovies3 <- function() {
  bw <- .5
  m <- ggplot(movies, aes(x = rating), environment = environment())
  m <- m + geom_histogram(binwidth = bw)
  # Error in eval(expr, envir, enclos) : object 'bw' not found
  m <- m + geom_density(aes(y = bw * ..count..))
}
> print(plotMovies3())
Error in eval(expr, envir, enclos) : object 'bw' not found
Finally, I tried a global variable, but it still can't find the object ...
# Failed third attempt using global binwidth
global_bw <<- .5
plotMovies4 <- function() {
  m <- ggplot(movies, aes(x = rating), environment = environment())
  m <- m + geom_histogram(binwidth = global_bw)
  # Error in eval(expr, envir, enclos) : object 'global_bw' not found
  m <- m + geom_density(aes(y = global_bw * ..count..))
}
> print(plotMovies4())
Error in eval(expr, envir, enclos) : object 'global_bw' not found
Considering plotMovies3 and plotMovies4, I guess this is not a simple environment problem. Can anyone shed some light on how I can solve this? Again, my goal is to create a histogram / density plot function where
- its y-axis is a count, not a density, and
- its bin width can be parameterized (e.g. so that the caller can control it)
By no means pretty, but if you need a workaround, you can compute the curve yourself with the base R function density:
plotMovies5 <- function(binw = 0.5) {
  m <- ggplot(movies, aes(x = rating))
  m <- m + geom_histogram(binwidth = binw)
  # In density(), bw is the kernel bandwidth (sd of the kernel), not a bin width
  wa <- density(x = movies$rating, bw = binw)
  # Rescale density to the count scale: density * n * bin width
  wa <- data.frame(xvals = wa$x, yvals = wa$y * wa$n * binw)
  m <- m + geom_point(data = wa, aes(x = xvals, y = yvals))
  m
}
print(plotMovies5(binw = 0.25))
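Note that you still have to tweak the parameters a bit, as the two density estimates are not exactly equal. This is presumably because geom_density chooses its own kernel bandwidth by default (bw = "nrd0"), whereas density() here is given bw = binw. The comparison below shows the difference: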
binw = 0.5
m <- ggplot(movies, aes(x = rating))
m <- m + geom_density(aes(y = 0.5 * ..count..))
wa <- density(x=movies$rating, bw = binw)
wa <- as.data.frame(cbind(xvals = wa$x, yvals = wa$y * wa$n * binw))
m <- m + geom_point(data = wa, aes(x = xvals, y = yvals))
m
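The rescaling wa$y * wa$n * binw used above can be checked without ggplot2. The following is only a minimal base R sketch: it uses synthetic ratings as a stand-in for movies$rating (an assumption, not the real data) and compares the area under the rescaled curve with the total area of a count histogram that uses the same bin width. The two should come out close, though not identical.

```r
# Synthetic stand-in for movies$rating; the real column is not used here.
set.seed(42)
rating <- rnorm(2000, mean = 6, sd = 1.5)
binw <- 0.5

# Kernel density estimate; bw is the kernel bandwidth, not a bin width.
d <- density(rating, bw = binw)

# Same rescaling as in the workaround: density -> count scale.
count_curve <- d$y * d$n * binw

# Trapezoidal area under the rescaled curve.
curve_area <- sum(diff(d$x) * (head(count_curve, -1) + tail(count_curve, -1)) / 2)

# Total area of a count histogram with the same bin width.
breaks <- seq(floor(min(rating)), ceiling(max(rating)), by = binw)
h <- hist(rating, breaks = breaks, plot = FALSE)
hist_area <- sum(h$counts) * binw

print(c(curve = curve_area, histogram = hist_area))
```

Matching areas only say that the curve is on the count scale; the shape still depends on the kernel bandwidth, which is why the points and the geom_density line above do not coincide exactly.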
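An alternative is to predefine the bin widths and build the aesthetic with aes_string, which pastes the value into the expression as a literal, so nothing has to be looked up later. Histograms can then be generated in a loop over the bin widths:
bins <- list()
bins[["Variable1"]] <- 2
bins[["Variable2"]] <- 0.5
bins[["Variable3"]] <- 1
# Loop over the predefined bin widths (the geom_histogram() layer is assumed here).
# Note: ..density.. * bin width is the proportion per bin; use ..count.. for counts.
for (i in names(bins)) {
  print(ggplot(movies, aes(x = rating)) +
    geom_histogram(aes_string(x = "rating", y = paste("..density..*", bins[[i]], sep = "")),
                   na.rm = TRUE, position = 'dodge', binwidth = bins[[i]]))
}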
This is a follow-up to mts's answer, intended as a long comment. First, the dataset is obtained by loading library("ggplot2movies"). Second, it can be interesting to loop over multiple values of binw to create a series of figures that will be used together, for example in an animation. So the code below just puts mts's code in a loop for that purpose. Indeed, a minor contribution.
### Data
library("ggplot2")
library("ggplot2movies")
### Histograms
ggplotMovieHistogram <- function(binw = 0.5) {
  require('ggplot2movies')
  p <- ggplot(movies, aes(x = rating)) +
    geom_histogram(binwidth = binw)
  wa <- density(x = movies$rating, bw = binw)
  wa <- as.data.frame(cbind(xvals = wa$x, yvals = wa$y * wa$n * binw))
  p <- p + geom_point(data = wa, aes(x = xvals, y = yvals))
  return(p)
}
ggsaveMovieHistogram <- function(binw = 0.5, file = 'test.pdf') {
  pdf(file, width = 8, height = 8)
  print(ggplotMovieHistogram(binw = binw))
  dev.off()
}
for(i in seq(0.2, 0.8, by = 0.2)) {
  ggsaveMovieHistogram(binw = i,
                       file = paste0('ggplot-barchart-loop-histogram-',
                                     format(i, decimal.mark = '-'),
                                     '.pdf'))
}
### Densities
library("ggplot2movies")
ggplotMovieDensity <- function(binw = 0.5) {
  require('ggplot2movies')
  p <- ggplot(movies, aes(x = rating)) +
    geom_density(aes(y = 0.5 * ..count..))
  wa <- density(x = movies$rating, bw = binw)
  wa <- as.data.frame(cbind(xvals = wa$x, yvals = wa$y * wa$n * binw))
  p <- p + geom_point(data = wa, aes(x = xvals, y = yvals))
  return(p)
}
ggsaveMovieDensity <- function(binw = 0.5, file = 'test.pdf') {
  pdf(file, width = 8, height = 8)
  print(ggplotMovieDensity(binw = binw))
  dev.off()
}
for(i in seq(0.2, 0.8, by = 0.2)) {
  ggsaveMovieDensity(binw = i,
                     file = paste0('ggplot-barchart-loop-density-',
                                   format(i, decimal.mark = '-'),
                                   '.pdf'))
}
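As a side note to the workarounds above: on newer ggplot2 releases (assuming version 3.3.0 or later, where after_stat() is available), a function argument can be used directly inside aes(), because aesthetics keep a reference to the environment they were written in. The sketch below is only an illustration under that assumption. plotRatings and the synthetic ratings data frame are hypothetical names, used so the example runs without ggplot2movies.

```r
library(ggplot2)

# Synthetic ratings standing in for movies$rating (an assumption for this sketch).
set.seed(1)
ratings <- data.frame(rating = round(runif(500, min = 1, max = 10), 1))

# plotRatings is a hypothetical name. binw is found in the function's own
# environment when the after_stat() expression is evaluated.
plotRatings <- function(data, binw = 0.5) {
  ggplot(data, aes(x = rating)) +
    geom_histogram(binwidth = binw) +
    geom_density(aes(y = after_stat(count * binw)))
}

p <- plotRatings(ratings, binw = 0.25)
print(p)
```

Here count is the computed variable of the density stat (density times the number of points), so multiplying by the bin width puts the curve on the same scale as the histogram counts. On the older ggplot2 versions discussed in the question this lookup fails, which is why the density() workaround was needed there.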