Plot with overlapping points
I have data in R with overlapping points.
x = c(4,4,4,7,3,7,3,8,6,8,9,1,1,1,8)
y = c(5,5,5,2,1,2,5,2,2,2,3,5,5,5,2)
plot(x,y)
How can I plot these points so that overlapping points are drawn proportionally larger than points that do not overlap? For example, if 3 points lie at (4,5), then the marker at (4,5) should be three times larger than the marker for a location with only one point.
Here's one way using ggplot2:
library(ggplot2)
x = c(4,4,4,7,3,7,3,8,6,8,9,1,1,1,8)
y = c(5,5,5,2,1,2,5,2,2,2,3,5,5,5,2)
df <- data.frame(x = x,y = y)
ggplot(data = df,aes(x = x,y = y)) + stat_sum()
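By default, stat_sum sized points by the proportion of observations at each location when this answer was written (more recent ggplot2 releases appear to default to the raw count n). You can request raw counts explicitly by doing something like: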
ggplot(data = df,aes(x = x,y = y)) + stat_sum(aes(size = ..n..))
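On newer ggplot2 versions the ..n.. notation is flagged as deprecated in favour of after_stat(n). A minimal sketch of the same plot, assuming ggplot2 3.3.0 or later; scale_size_area() is my addition here (not part of the answer above) so that point area, rather than radius, tracks the count:

```r
library(ggplot2)

x <- c(4,4,4,7,3,7,3,8,6,8,9,1,1,1,8)
y <- c(5,5,5,2,1,2,5,2,2,2,3,5,5,5,2)
df <- data.frame(x = x, y = y)

# Map point size to the raw count at each (x, y) location.
# after_stat(n) is the newer spelling of ..n.. (assumes ggplot2 >= 3.3.0).
p <- ggplot(df, aes(x = x, y = y)) +
  stat_sum(aes(size = after_stat(n))) +
  # Assumption: scale area (not radius) with the count; a count of 0 maps to size 0.
  scale_size_area(breaks = 1:3)

print(p)
```

Whether area or radius should carry the "three times larger" meaning is a design choice; drop scale_size_area() to get ggplot2's default size scale.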
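## Tabulate the number of occurrences of each coordinate.
## (aggregate keeps each count on the same row as its (x, y) pair; pairing
## unique(df) with tapply output can misalign them, since tapply sorts groups.)
df <- data.frame(x, y)
df2 <- aggregate(value ~ x + y, data = transform(df, value = 1), FUN = sum)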
## Use cex to set point size to some function of coordinate count
## (By using sqrt(value), the _area_ of each point will be proportional
## to the number of observations it represents)
plot(y ~ x, cex = sqrt(value), data = df2, pch = 16)
You need to pass the cex parameter to the plotting function. First, I would use as.data.frame and table to reduce your data to unique (x, y) pairs and their frequencies:
new.data = as.data.frame(table(x,y))
new.data = new.data[new.data$Freq != 0,] # Remove points with zero frequency
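The only downside to this is that it converts the numeric data to factors. So convert back to numeric (going through as.character, because as.numeric on a factor returns the level codes rather than the original values) and plot!
plot(as.numeric(as.character(new.data$x)), as.numeric(as.character(new.data$y)), cex = new.data$Freq)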
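One thing worth checking with this approach: calling as.numeric directly on a factor returns the internal level codes, not the values the labels spell out. A minimal sketch with a made-up three-value factor (f is hypothetical, not from the data above):

```r
# A small factor whose labels are numbers (hypothetical example values).
f <- factor(c(3, 7, 9))

# Direct conversion yields the level codes.
codes <- as.numeric(f)
print(codes)    # 1 2 3

# Converting via character recovers the original values.
values <- as.numeric(as.character(f))
print(values)   # 3 7 9
```

With the question's data the difference is visible on the axes: the x values 1, 3, 4, 6, 7, 8, 9 would be plotted at positions 1 through 7 if the codes were used.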
Let me suggest alternatives to tweaking the point size. One downside of encoding counts as size (radius? area?) is that the reader's judgement of spot size relative to the underlying numeric value is subjective.
So option 1: paint each point with transparency (ninja'd by Tyler!). Option 2: use jitter to shift the data slightly so that the overlaid points do not overlap.
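Both options can be sketched in base R. The jitter amount (0.15) and the alpha level (0.4) below are arbitrary values chosen for illustration, not recommendations from this answer:

```r
x <- c(4,4,4,7,3,7,3,8,6,8,9,1,1,1,8)
y <- c(5,5,5,2,1,2,5,2,2,2,3,5,5,5,2)

# Option 1: semi-transparent points, so stacked points appear darker.
# adjustcolor() is in grDevices, which is attached by default.
plot(x, y, pch = 16, cex = 2, col = adjustcolor("black", alpha.f = 0.4))

# Option 2: jitter, which adds uniform noise in [-amount, amount] to each value.
set.seed(1)  # fixed seed only so the sketch is reproducible
xj <- jitter(x, amount = 0.15)
yj <- jitter(y, amount = 0.15)
plot(xj, yj, pch = 16)
```

The two can also be combined by passing the jittered coordinates and the transparent colour to the same plot call.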