Scikit-learn kmeans custom distance

Question

I want to use the kmeans algorithm to cluster some data, but I would like to use a custom distance function. Is there a way to change the distance function scikit-learn uses?

I would also settle for another framework or module that lets me swap in my own distance function and can run the computation in parallel (I would like to speed things up, and parallelism is a nice feature of scikit-learn).

Any suggestions?

python scikit-learn

Nils ziehn, June 29, 2015 at 23:22

1 answer

Florian gauthier · Answer 1 · 2015-06-30T14:57:01+0000

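You can try the spectral clustering algorithm, which lets you supply your own precomputed matrix, calculated however you like. Note that scikit-learn's implementation expects an affinity (similarity) matrix rather than raw distances, so a custom distance has to be converted to a similarity first.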

It performs about as well as K-means on convex clusters, and it also works on non-convex problems because it detects connectivity. More details here.

The good news is that spectral clustering is also implemented in scikit-learn.
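As a rough sketch of how this could look: compute the pairwise distances with your own function, turn them into similarities, and pass the result to `SpectralClustering` with `affinity="precomputed"`. The names `custom_distance` and `cluster_with_custom_distance`, the Manhattan metric, and the Gaussian kernel with its `sigma` value are all assumptions made for illustration, not part of the original answer; any conversion that maps small distances to large similarities could be used instead.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import pairwise_distances


def custom_distance(a, b):
    # Hypothetical custom metric (Manhattan distance here);
    # replace the body with your own distance function.
    return np.abs(a - b).sum()


def cluster_with_custom_distance(X, n_clusters, sigma=2.0, n_jobs=None):
    # Pairwise distance matrix from the custom callable.
    # n_jobs is forwarded to pairwise_distances so the matrix
    # can be computed in parallel.
    D = pairwise_distances(X, metric=custom_distance, n_jobs=n_jobs)

    # Assumption: a Gaussian kernel converts distances to affinities.
    # sigma is a scale parameter that has to be tuned for your data.
    A = np.exp(-(D ** 2) / (2.0 * sigma ** 2))

    model = SpectralClustering(
        n_clusters=n_clusters,
        affinity="precomputed",  # A is used directly as the affinity matrix
        random_state=0,
    )
    return model.fit_predict(A)


# Toy data: two well-separated blobs in 2D.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 2)),
    rng.normal(4.0, 0.3, size=(20, 2)),
])
labels = cluster_with_custom_distance(X, n_clusters=2)
print(labels)
```

Keep in mind that the full n-by-n matrix has to fit in memory, so this approach is best suited to small and medium-sized datasets.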

Hope it helps.
