How do I find the closest matching vectors between two coordinate matrices?

I have the following problem in Python that I need to solve:

For two coordinate matrices (NumPy ndarrays) A and B, find, for every coordinate vector a in A, the corresponding coordinate vector b in B such that the Euclidean distance ||a-b|| is minimized. The coordinate matrices A and B can have a different number of coordinate vectors (i.e., a different number of rows).

This method should return a matrix of coordinate vectors C, where the i-th vector c in C is the vector from B that minimizes the Euclidean distance to the i-th coordinate vector a in A.

For example, let's say

A = np.array([[1,1], [3,4]])
B = np.array([[1,2], [3,6], [8,1]])

The Euclidean distances between the vector [1,1] in A and the vectors in B are:

1, 5.385165, 7

So the first vector in C will be [1,2].

Similarly, the distances between the vector [3,4] in A and the vectors in B are:

2.828427, 2, 5.830952

So the second (and last) vector in C will be [3,6].

So C = [[1,2], [3,6]]

How do I code this correctly in Python?



1 answer


You can use cdist from scipy.spatial.distance to compute the Euclidean distances efficiently, then use np.argmin to get the index of the minimum distance in each row, and use those indices to index into B for the final output. Here's the implementation -

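import numpy as np
from scipy.spatial.distance import cdist

# cdist(A, B)[i, j] is the Euclidean distance between A[i] and B[j];
# argmin along axis 1 gives, for each row of A, the index of its nearest row in B
C = B[np.argmin(cdist(A, B), axis=1)]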
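To make the three steps in that one-liner visible (pairwise distances, per-row argmin, indexing into B), here is a minimal NumPy-only sketch using broadcasting. The helper name closest_rows is hypothetical, not part of NumPy or SciPy, and the sketch assumes A and B are 2-D arrays with the same number of columns.

```python
import numpy as np

def closest_rows(A, B):
    """Hypothetical helper: for each row of A, pick the nearest row of B."""
    A = np.asarray(A)
    B = np.asarray(B)
    # diff[i, j] = A[i] - B[j], shape (len(A), len(B), n_dims)
    diff = A[:, None, :].astype(float) - B[None, :, :]
    # dist[i, j] = Euclidean distance between A[i] and B[j]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # nearest[i] = index of the row of B closest to A[i]
    nearest = dist.argmin(axis=1)
    return B[nearest], dist

# Data from the question
A = np.array([[1, 1], [3, 4]])
B = np.array([[1, 2], [3, 6], [8, 1]])
C, dist = closest_rows(A, B)
print(dist)
print(C)
```

Like cdist, this builds the full len(A) x len(B) distance matrix (plus a temporary three-dimensional array), so it is meant as an illustration of the idea rather than a replacement for cdist.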

      



Example run -

In [99]: A
Out[99]: 
array([[1, 1],
       [3, 4]])

In [100]: B
Out[100]: 
array([[1, 2],
       [3, 6],
       [8, 1]])

In [101]: B[np.argmin(cdist(A,B),1)]
Out[101]: 
array([[1, 2],
       [3, 6]])
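If A and B are large, the full distance matrix that cdist builds may not fit comfortably in memory. One possible alternative, sketched here as a suggestion rather than as part of the approach above, is a k-d tree from scipy.spatial, which answers nearest-neighbour queries without materializing every pairwise distance. Whether it is actually faster depends on the sizes and dimensionality of your data.

```python
import numpy as np
from scipy.spatial import cKDTree

# Data from the question
A = np.array([[1, 1], [3, 4]])
B = np.array([[1, 2], [3, 6], [8, 1]])

# Build the tree once over the candidate vectors in B
tree = cKDTree(B)

# For each row of A, query the single nearest neighbour in B (k=1 is the default);
# dist[i] is the distance and idx[i] is the row index in B
dist, idx = tree.query(A)

C = B[idx]
print(idx)
print(C)
```

For the arrays in the question this selects the same rows of B as the cdist version.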

      
