How do I do this in NumPy?

I have an array X of 3D coordinates of N points (N × 3) and want to calculate the Euclidean distance between each pair of points.

I can do this by iterating over the points and comparing each point's distances to all the others against a threshold:

coords = np.array([v.xyz for v in vertices])
for vertex in vertices:
    # Squared distances are compared, so threshold must be a squared distance too.
    tests = np.sum((coords - vertex.xyz) ** 2, 1) < threshold
    closest = [v for v, t in zip(vertices, tests) if t]

      

Can this be done in one operation? I remember some linear algebra from 10 years ago but can't find a way to do it.

It should probably be a 3D array (point a, point b, axis) that is then summed along the axis dimension.

edit: Found a solution on my own, but it doesn't work with large datasets.

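    coords = np.array([v.xyz for v in vertices])
    big = np.repeat(np.array([coords]), len(coords), 0)  # shape (N, N, 3)
    big_same = np.swapaxes(big, 0, 1)
    # Sum over the coordinate axis (axis 2) to get an (N, N) matrix of squared distances.
    tests = np.sum((big - big_same) ** 2, 2) < thr_square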

    for v, test_vector in zip(vertices, tests):
        v.closest = self.filter(vertices, test_vector)

      



3 answers


Use scipy.spatial.distance. If X is an n × 3 array of points, you can get the n × n matrix of distances from

from scipy.spatial import distance
D = distance.squareform(distance.pdist(X))

      

Then the index of the point closest to point i is

np.argsort(D[i])[1]

(The [1] skips the value on the diagonal, which will be returned first.)
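To tie this back to the threshold test in the question, here is a minimal sketch of the same pdist/squareform approach applied to a small random data set. The sample data, the `threshold` value, and the variable names `close`, `neighbours`, and `nearest` are assumptions made up for illustration, not part of the original answer.

```python
import numpy as np
from scipy.spatial import distance

rng = np.random.default_rng(0)      # seeded only so the sketch is repeatable
X = rng.random((6, 3))              # hypothetical sample data: 6 points in 3D
threshold = 0.5                     # hypothetical distance cutoff

# Full (6, 6) symmetric distance matrix with zeros on the diagonal.
D = distance.squareform(distance.pdist(X))

# Boolean mask of points closer than the threshold, excluding each point itself.
close = D < threshold
np.fill_diagonal(close, False)

# For every point, the indices of its neighbours within the threshold.
neighbours = [np.flatnonzero(row) for row in close]

# For every point, the index of its single nearest other point.
D_no_self = D.copy()
np.fill_diagonal(D_no_self, np.inf)
nearest = D_no_self.argmin(axis=1)
```

Note that squareform still stores all n × n distances, so this has the same memory limits as the solution in the question's edit.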



I'm not really sure what you are asking here. If you are calculating the Euclidean distance between each pair of points in a set of N points, it would make sense to me to represent the results as a lookup matrix. So for N points, you get an N × N symmetric matrix. Element (3, 5) will represent the distance between points 3 and 5, while element (2, 2) will be the distance between point 2 and itself (zero). This is how I would do it for random points:



import numpy as np

N = 5

coords = np.random.rand(N, 3)  # N random points in 3D
dist = np.zeros((N, N))

for i in range(N):
    for j in range(i, N):
        # The matrix is symmetric, so fill both halves at once.
        dist[i, j] = np.linalg.norm(coords[i] - coords[j])
        dist[j, i] = dist[i, j]

print(dist)
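The nested loop can also be written as a single broadcast operation, which is roughly the (point a, point b, axis) 3D array the question describes. The following is a sketch under that assumption rather than part of the original answer; the name `diff` and the seeded generator are illustrative choices.

```python
import numpy as np

N = 5
rng = np.random.default_rng(0)      # seeded only so the sketch is repeatable
coords = rng.random((N, 3))

# diff[a, b, :] holds coords[a] - coords[b]; broadcasting gives shape (N, N, 3).
diff = coords[:, None, :] - coords[None, :, :]

# Sum the squared differences over the last (coordinate) axis, then take the root.
dist = np.sqrt((diff ** 2).sum(axis=-1))

print(dist.shape)   # (5, 5)
```

The intermediate `diff` array holds N × N × 3 values, so this trades memory for speed compared with the loop.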

      



If xyz is an array with your coordinates, then the following code will calculate the distance matrix (it is fast as long as you have enough memory to store N² distances):

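import numpy as np
xyz = np.random.uniform(size=(1000, 3))
# Per-axis pairwise differences via broadcasting, squared, summed, then square-rooted.
distances = sum((xyz[:, i][:, None] - xyz[:, i][None, :]) ** 2 for i in range(3)) ** .5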
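When N² distances do not fit in memory, one possible workaround is to process the points in blocks of rows and keep only the threshold test result for each block. The sketch below assumes that only the neighbours within a threshold are needed (as in the question); the helper `neighbours_within` and its `chunk` parameter are hypothetical names, not an existing NumPy function.

```python
import numpy as np

def neighbours_within(coords, threshold, chunk=256):
    """Hypothetical helper: for each point, indices of the other points
    whose distance to it is below `threshold`."""
    thr_sq = threshold ** 2          # compare squared distances, no sqrt needed
    result = []
    for start in range(0, len(coords), chunk):
        block = coords[start:start + chunk]                # (c, 3)
        diff = block[:, None, :] - coords[None, :, :]      # (c, N, 3)
        close = (diff ** 2).sum(axis=-1) < thr_sq          # (c, N) boolean mask
        # A point is always within the threshold of itself; mask that out.
        rows = np.arange(len(block))
        close[rows, start + rows] = False
        result.extend(np.flatnonzero(row) for row in close)
    return result

rng = np.random.default_rng(0)       # seeded only so the sketch is repeatable
xyz = rng.uniform(size=(1000, 3))
neighbours = neighbours_within(xyz, threshold=0.1)
```

Peak memory is then roughly chunk × N × 3 values instead of N × N × 3, at the cost of a short Python loop over the blocks.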

      
