Scatter plot with alpha, still opaque over areas where points are dense
I have a scatter plot that draws a very large number of points from two different datasets. In some areas there are so many points that even at very low alpha (for example, alpha = 0.1) you cannot see through the mass. But at that alpha you can barely see the points in the sparse regions. Is there a way to cap the alpha for stacked points, or somehow make the background visible under dense areas without washing out the sparse areas?
The code snippet looks like this:
# Code to populate the datasets not included.
fig, ax = plt.subplots()
ax.scatter(x1, y1, s=12, color='red')
ax.scatter(x2, y2, s=12, color='blue', alpha=0.1)
# Plus code to do xlabels and such not included.
to produce this:
As you can see, it is difficult to make out the boundaries of the lower red leg while still resolving the upper blue leg.
Is there a way to create this effect?
Thanks in advance.
EDIT
One good suggestion is to use hexbin instead of scatter. It looks promising, but the colors still don't blend nicely. For example,
ax.hexbin(x1, y1, cmap='Reds', mincnt=1, vmax=100)
ax.hexbin(x2, y2, cmap='Blues', mincnt=1, vmax=50, alpha=0.8, linewidths=0)
gives:
It would be really nice to have these blues and reds blended. Maybe each pixel could take its R value from one dataset and its B value from the other, or something similar? But that doesn't look like an option with hexbin.
EDIT
After using Thomasillo's answer:
Thanks, I think it looks better than the original.
1) To improve the hexbin plot, you can use the bins='log' option. This scales the color of each hexagonal bin logarithmically with its count, so low counts stand out better relative to high ones.
2) Calculate the density of each dataset yourself, then generate a color from both densities, for example by letting one density drive the red channel and the other the blue channel. Display the result using imshow.
Something like
import matplotlib.pyplot as plt
import numpy as np

# Two synthetic datasets standing in for the real data.
x1 = np.random.binomial(5100,0.5,51100)
y1 = np.random.binomial(5000,0.7,51100)
x2 = np.random.binomial(5000,0.5,51100)
y2 = np.random.binomial(5000,0.7,51100)

# Regular grid of nodes covering the region of interest.
xmin,xmax,xnum = 2350,2700,50
ymin,ymax,ynum = 3350,3700,50
xx,yy = np.mgrid[xmin:xmax:xnum*1j,ymin:ymax:ynum*1j]

def closest_idx(x,y):
    # Index of the grid node nearest to the point (x, y).
    idcs = np.argmin((xx-x)**2 + (yy-y)**2)
    i_x,i_y = np.unravel_index(idcs, (xnum,ynum))
    return i_x,i_y

def calc_count(xdat,ydat):
    # Count how many points land nearest to each grid node.
    ct = np.zeros_like(xx)
    for x,y in zip(xdat,ydat):
        ix,iy = closest_idx(x,y)
        ct[ix,iy] += 1
    return ct

ct1 = calc_count(x1,y1)
ct2 = calc_count(x2,y2)

def color_mix(c1, c2):
    # Average the two colors channel by channel.
    cm = np.empty_like(c1)
    for i in [0,1,2]:
        cm[i] = (c1[i]+c2[i])/2.
    return cm

# Normalize each count grid to the range [0, 1].
dens1 = ct1 / np.max(ct1)
dens2 = ct2 / np.max(ct2)

# Dataset 1 fades from white to red, dataset 2 from white to blue.
ct1_color = np.array([1+0*dens1, 1-dens1, 1-dens1])
ct2_color = np.array([1-dens2, 1-dens2, 1+0*dens2])

col = color_mix(ct1_color, ct2_color)
# Reorder from (channel, x, y) to (y, x, channel), which is what imshow expects.
col = np.transpose(col, axes=(2,1,0))

plt.imshow(col, interpolation='nearest', extent=(xmin,xmax,ymin,ymax), origin='lower')
plt.show()
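The per-point loop above gets slow for large datasets. As a rough sketch of the same idea, the counting can be handed to np.histogram2d instead. Note that this is a swapped-in technique, not the code above: the helper names density_rgb and show are made up for illustration, histogram2d bins by edges rather than by nearest grid node, and points outside the extent are dropped instead of being snapped to the border.

```python
import numpy as np

def density_rgb(x1, y1, x2, y2, bins=50, extent=None):
    """Blend two point clouds into one RGB image (dataset 1 red, dataset 2 blue).

    Hypothetical helper for illustration; extent is (xmin, xmax, ymin, ymax).
    """
    if extent is None:
        extent = (min(x1.min(), x2.min()), max(x1.max(), x2.max()),
                  min(y1.min(), y2.min()), max(y1.max(), y2.max()))
    hist_range = [[extent[0], extent[1]], [extent[2], extent[3]]]
    # histogram2d returns counts indexed [x_bin, y_bin]; transpose so rows are y.
    ct1 = np.histogram2d(x1, y1, bins=bins, range=hist_range)[0].T
    ct2 = np.histogram2d(x2, y2, bins=bins, range=hist_range)[0].T
    # Normalize each density to [0, 1]; the max(..., 1) guards an empty histogram.
    d1 = ct1 / max(ct1.max(), 1)
    d2 = ct2 / max(ct2.max(), 1)
    # Same color rule as the loop version: average a white-to-red ramp
    # with a white-to-blue ramp.
    red = np.stack([np.ones_like(d1), 1 - d1, 1 - d1], axis=-1)
    blue = np.stack([1 - d2, 1 - d2, np.ones_like(d2)], axis=-1)
    return (red + blue) / 2, extent

def show(rgb, extent):
    # Imported here so the numeric part also runs without a display.
    import matplotlib.pyplot as plt
    plt.imshow(rgb, interpolation='nearest', extent=extent, origin='lower')
    plt.show()

# Synthetic data with the same distributions as the example above.
rng = np.random.default_rng(0)
x1 = rng.binomial(5100, 0.5, 51100)
y1 = rng.binomial(5000, 0.7, 51100)
x2 = rng.binomial(5000, 0.5, 51100)
y2 = rng.binomial(5000, 0.7, 51100)

rgb, extent = density_rgb(x1, y1, x2, y2, bins=50,
                          extent=(2350, 2700, 3350, 3700))
```

Calling show(rgb, extent) displays the image. Empty pixels come out white, and pixels where both datasets are dense tend toward purple.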
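For suggestion 1), a minimal sketch of what the hexbin call might look like with bins='log'. The data here is synthetic (two overlapping Gaussian blobs, purely an assumption to have something to plot), and the alpha value is just a starting point to tune.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Made-up stand-in data: two overlapping point clouds.
rng = np.random.default_rng(0)
x1 = rng.normal(0.0, 1.0, 50000)
y1 = rng.normal(0.0, 1.0, 50000)
x2 = rng.normal(1.5, 1.0, 50000)
y2 = rng.normal(1.0, 1.0, 50000)

fig, ax = plt.subplots()
# bins='log' maps counts to colors on a logarithmic scale, so sparse bins
# are not drowned out by the densest ones. mincnt=1 hides empty hexagons.
hb1 = ax.hexbin(x1, y1, cmap='Reds', mincnt=1, bins='log')
hb2 = ax.hexbin(x2, y2, cmap='Blues', mincnt=1, bins='log',
                alpha=0.6, linewidths=0)
# Use fig.savefig(...) or plt.show() to look at the result.
```

The second layer is still drawn on top of the first, so this improves contrast within each dataset but does not blend the two colors per pixel the way the imshow approach does.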