How can I vectorize this loop difference in numpy?

I feel like there must be a quick way to speed up this code. I think the answer is here , but I cannot find my problem in this format. The main problem I'm trying to solve is to find the exact difference in terms of parallel and perpendicular components and create a 2d histogram of those differences.

out = np.zeros((len(rpbins)-1,len(pibins)-1))
tmp = np.zeros((len(x),2))
for i in xrange(len(x)):
    tmp[:,0] = x - x[i]
    tmp[:,1] = y - y[i]

    para = np.sum(tmp**2,axis=-1)**(1./2)
    perp = np.abs(z - z[i])

    H, _, _ = np.histogram2d(para, perp, bins=[rpbins, pibins])
    out += H

      

+3


source to share


1 answer


Vectorizing things like this is tricky because to get rid of the loop over the elements n

you need to build an array (n, n)

, so for large inputs you will probably get worse performance than in a Python loop, but it can be done:

mask = np.triu_indices(x.shape[0], 1)
para = np.sqrt((x[:, None] - x)**2 + (y[:, None] - y)**2)
perp = np.abs(z[:, None] - z)
hist, _, _ = np.histogram2d(para[mask], perp[mask], bins=[rpbins, pibins])

      

mask

is not to count each distance twice. I also set the diagonal offset to 1

to avoid including the distances of 0

each point to itself in the histogram. But if you don't index para

and perp

with it, you get the same result as your code.

With example data:

items = 100
rpbins, pibins = np.linspace(0, 1, 3), np.linspace(0, 1, 3)
x = np.random.rand(items)
y = np.random.rand(items)
z = np.random.rand(items)

      

I get this for mine hist

and yours out

:



>>> hist
array([[ 1795.,   651.],
       [ 1632.,   740.]])
>>> out
array([[ 3690.,  1302.],
       [ 3264.,  1480.]])

      

and out[i, j] = 2 * hist[i, j]

except i = j = 0

where out[0, 0] = 2 * hist[0, 0] + items

because of the distance of 0

each element to itself.


EDIT Tested the following after tcaswell's comment:

items = 1000
rpbins, pibins = np.linspace(0, 1, 3), np.linspace(0, 1, 3)
x, y, z = np.random.rand(3, items)

def hist1(x, y, z, rpbins, pibins) :
    mask = np.triu_indices(x.shape[0], 1)
    para = np.sqrt((x[:, None] - x)**2 + (y[:, None] - y)**2)
    perp = np.abs(z[:, None] - z)
    hist, _, _ = np.histogram2d(para[mask], perp[mask], bins=[rpbins, pibins])
    return hist

def hist2(x, y, z, rpbins, pibins) :
    mask = np.triu_indices(x.shape[0], 1)
    para = np.sqrt((x[:, None] - x)[mask]**2 + (y[:, None] - y)[mask]**2)
    perp = np.abs((z[:, None] - z)[mask])
    hist, _, _ = np.histogram2d(para, perp, bins=[rpbins, pibins])
    return hist

def hist3(x, y, z, rpbins, pibins) :
    mask = np.triu_indices(x.shape[0], 1)
    para = np.sqrt(((x[:, None] - x)**2 + (y[:, None] - y)**2)[mask])
    perp = np.abs((z[:, None] - z)[mask])
    hist, _, _ = np.histogram2d(para, perp, bins=[rpbins, pibins])
    return hist

In [10]: %timeit -n1 -r10 hist1(x, y, z, rpbins, pibins)
1 loops, best of 10: 289 ms per loop

In [11]: %timeit -n1 -r10 hist2(x, y, z, rpbins, pibins)
1 loops, best of 10: 294 ms per loop

In [12]: %timeit -n1 -r10 hist3(x, y, z, rpbins, pibins)
1 loops, best of 10: 278 ms per loop

      

It seems like most of the time is spent creating new arrays rather than actual calculations, so while there is some efficiency to clean up, there really aren't that many.

+3


source







All Articles