How can I vectorize this loop difference in numpy?
I feel like there must be a quick way to speed up this code. I think the answer is here , but I cannot find my problem in this format. The main problem I'm trying to solve is to find the exact difference in terms of parallel and perpendicular components and create a 2d histogram of those differences.
out = np.zeros((len(rpbins)-1,len(pibins)-1))
tmp = np.zeros((len(x),2))
for i in xrange(len(x)):
tmp[:,0] = x - x[i]
tmp[:,1] = y - y[i]
para = np.sum(tmp**2,axis=-1)**(1./2)
perp = np.abs(z - z[i])
H, _, _ = np.histogram2d(para, perp, bins=[rpbins, pibins])
out += H
source to share
Vectorizing things like this is tricky because to get rid of the loop over the elements n
you need to build an array (n, n)
, so for large inputs you will probably get worse performance than in a Python loop, but it can be done:
mask = np.triu_indices(x.shape[0], 1) para = np.sqrt((x[:, None] - x)**2 + (y[:, None] - y)**2) perp = np.abs(z[:, None] - z) hist, _, _ = np.histogram2d(para[mask], perp[mask], bins=[rpbins, pibins])
mask
is not to count each distance twice. I also set the diagonal offset to 1
to avoid including the distances of 0
each point to itself in the histogram. But if you don't index para
and perp
with it, you get the same result as your code.
With example data:
items = 100 rpbins, pibins = np.linspace(0, 1, 3), np.linspace(0, 1, 3) x = np.random.rand(items) y = np.random.rand(items) z = np.random.rand(items)
I get this for mine hist
and yours out
:
>>> hist
array([[ 1795., 651.],
[ 1632., 740.]])
>>> out
array([[ 3690., 1302.],
[ 3264., 1480.]])
and out[i, j] = 2 * hist[i, j]
except i = j = 0
where out[0, 0] = 2 * hist[0, 0] + items
because of the distance of 0
each element to itself.
EDIT Tested the following after tcaswell's comment:
items = 1000
rpbins, pibins = np.linspace(0, 1, 3), np.linspace(0, 1, 3)
x, y, z = np.random.rand(3, items)
def hist1(x, y, z, rpbins, pibins) :
mask = np.triu_indices(x.shape[0], 1)
para = np.sqrt((x[:, None] - x)**2 + (y[:, None] - y)**2)
perp = np.abs(z[:, None] - z)
hist, _, _ = np.histogram2d(para[mask], perp[mask], bins=[rpbins, pibins])
return hist
def hist2(x, y, z, rpbins, pibins) :
mask = np.triu_indices(x.shape[0], 1)
para = np.sqrt((x[:, None] - x)[mask]**2 + (y[:, None] - y)[mask]**2)
perp = np.abs((z[:, None] - z)[mask])
hist, _, _ = np.histogram2d(para, perp, bins=[rpbins, pibins])
return hist
def hist3(x, y, z, rpbins, pibins) :
mask = np.triu_indices(x.shape[0], 1)
para = np.sqrt(((x[:, None] - x)**2 + (y[:, None] - y)**2)[mask])
perp = np.abs((z[:, None] - z)[mask])
hist, _, _ = np.histogram2d(para, perp, bins=[rpbins, pibins])
return hist
In [10]: %timeit -n1 -r10 hist1(x, y, z, rpbins, pibins)
1 loops, best of 10: 289 ms per loop
In [11]: %timeit -n1 -r10 hist2(x, y, z, rpbins, pibins)
1 loops, best of 10: 294 ms per loop
In [12]: %timeit -n1 -r10 hist3(x, y, z, rpbins, pibins)
1 loops, best of 10: 278 ms per loop
It seems like most of the time is spent creating new arrays rather than actual calculations, so while there is some efficiency to clean up, there really aren't that many.
source to share