Plot with overlapping points

I have data in R with overlapping points.

x = c(4,4,4,7,3,7,3,8,6,8,9,1,1,1,8)
y = c(5,5,5,2,1,2,5,2,2,2,3,5,5,5,2)
plot(x,y)

      

How can I plot these points so that positions where points overlap are drawn proportionally larger than positions with a single point? For example, if 3 points lie at (4,5), then the symbol at (4,5) should be three times larger than the symbol for a position with only one point.

+3




8 answers


Here's a simpler (I think) solution:



x <- c(4,4,4,7,3,7,3,8,6,8,9,1,1,1,8)
y <- c(5,5,5,2,1,2,5,2,2,2,3,5,5,5,2)
# For each point, count how many points share its (x, y) position
size <- sapply(seq_along(x), function(i) { sum(x==x[i] & y==y[i]) })
plot(x,y, cex=size)

      

+6




Here's one way, using ggplot2:

library(ggplot2)
x = c(4,4,4,7,3,7,3,8,6,8,9,1,1,1,8)
y = c(5,5,5,2,1,2,5,2,2,2,3,5,5,5,2)
df <- data.frame(x = x,y = y)
ggplot(data = df,aes(x = x,y = y)) + stat_sum()

      




stat_sum uses the proportion of instances by default. You can use raw counts instead by doing something like:

ggplot(data = df,aes(x = x,y = y)) + stat_sum(aes(size = ..n..))

      

+8




## Tabulate the number of occurrences of each coordinate
## (aggregate keeps each count on the same row as its x, y pair)
df <- data.frame(x, y)
df2 <- aggregate(value ~ x + y, data = cbind(df, value = 1), FUN = sum)

## Use cex to set point size to some function of coordinate count
## (By using sqrt(value), the _area_ of each point will be proportional
##  to the number of observations it represents)
plot(y ~ x, cex = sqrt(value), data = df2, pch = 16)
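
To make the sqrt reasoning concrete, here is a small arithmetic sketch. It assumes that cex scales the linear dimensions of a plotting symbol, so the drawn area grows with cex squared; the variable names are only illustrative.

```r
# Hypothetical counts for three positions
counts <- c(1, 2, 3)

# Option A: cex equal to the count. Relative area grows as count^2.
cex_linear  <- counts
area_linear <- cex_linear^2   # 1, 4, 9

# Option B: cex equal to sqrt(count). Relative area equals the count.
cex_area  <- sqrt(counts)
area_area <- cex_area^2       # 1, 2, 3 (up to floating point rounding)

print(data.frame(counts, area_linear, area_area))
```

With option A a position holding 3 points is drawn with 9 times the area of a single point, which tends to exaggerate the difference; option B keeps area and count in step.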

      


+5




You didn't really ask for this, but alpha might be another way to solve this problem:

library(ggplot2)
ggplot(data.frame(x=x, y=y), aes(x, y)) + geom_point(alpha=.3, size = 3)

      


+4




You need to pass the cex parameter to the plotting function. First, I would use as.data.frame and table to reduce your data to unique (x, y) pairs and their frequencies:

new.data = as.data.frame(table(x,y))
new.data = new.data[new.data$Freq != 0,] # Remove points with zero frequency

      

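The only downside to this is that it converts numeric data to factors. So convert back to numeric (going through as.character, so that you get the original values rather than the factor level codes) and graph!

plot(as.numeric(as.character(new.data$x)), as.numeric(as.character(new.data$y)), cex = new.data$Freq)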
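
One thing to watch when converting a factor back to numbers: as.numeric applied directly to a factor returns the internal level codes, not the original values. A minimal sketch with a made-up factor f:

```r
# A factor whose levels are the strings "3", "7" and "9"
f <- factor(c(3, 7, 9))

codes  <- as.numeric(f)                 # level codes: 1 2 3
values <- as.numeric(as.character(f))   # original values: 3 7 9

print(codes)
print(values)
```

Plotting the level codes would space the points evenly along the axis, regardless of the real x and y values.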

      

+3




You can also try sunflowerplot.

sunflowerplot(x,y)
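
If you prefer sized points to sunflower petals, the same kind of tabulation is available directly through xyTable from grDevices, which returns the unique coordinates and how often each occurs. This is only a sketch; the sqrt scaling and pch = 16 are my own choices here, not something sunflowerplot does.

```r
x <- c(4,4,4,7,3,7,3,8,6,8,9,1,1,1,8)
y <- c(5,5,5,2,1,2,5,2,2,2,3,5,5,5,2)

# xyTable returns a list with components x, y and number,
# where number is the count of points at each unique (x, y)
tab <- xyTable(x, y, digits = 6)

# Scale symbol area with the count (cex = tab$number would scale the width instead)
plot(tab$x, tab$y, cex = sqrt(tab$number), pch = 16)
```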

      


+2




Let me suggest alternatives to tweaking the point size. One of the downsides of encoding a count as size (radius? area?) is that the reader's estimate of how symbol size maps to the underlying numeric value is subjective.

So, option 1: draw each point with transparency (ninja'd by Tyler!). Option 2: use jitter to shift the data slightly so that coincident points no longer sit exactly on top of each other.
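
A minimal sketch of the jitter option, using the data from the question. The amount of 0.15 and the fixed seed are arbitrary choices of mine; tune them to your data.

```r
x <- c(4,4,4,7,3,7,3,8,6,8,9,1,1,1,8)
y <- c(5,5,5,2,1,2,5,2,2,2,3,5,5,5,2)

set.seed(1)  # only so the random offsets are reproducible

# jitter() adds uniform noise in [-amount, amount] to each value
xj <- jitter(x, amount = 0.15)
yj <- jitter(y, amount = 0.15)

plot(xj, yj, pch = 16, xlab = "x", ylab = "y")
```

Positions holding several points now show up as small clusters, so the reader can count them directly instead of judging symbol size.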

+1




Solution using lattice and table (similar to @R_User, but there is no need to remove the zero counts, since a cex of 0 draws nothing):

library(lattice)
dt <- as.data.frame(table(x,y))
xyplot(dt$y~dt$x, cex = dt$Freq^2, col =dt$Freq)

      


0

