Scikit-learn kmeans custom distance
I want to use the kmeans algorithm to cluster some data, but I would like to use a custom distance function. Is there a way to change the distance function scikit-learn uses?
I would also settle for another framework / module that lets me swap in my own distance function and can run the computation in parallel (I would like to speed things up, and parallelism is a nice feature of scikit-learn).
Any suggestions?
You can try the spectral clustering algorithm, which lets you supply your own precomputed pairwise matrix (calculated however you like). Note that scikit-learn expects this matrix to hold similarities (affinities) rather than distances, so a distance matrix has to be converted first.
On convex clusters it performs about as well as K-means, and it also works on non-convex problems, because it groups points by connectivity. More details here.
The good news is that spectral clustering is also implemented in scikit-learn.
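Here is a minimal sketch of how this might look, not a definitive recipe. The `custom_distance` function (Manhattan distance here) and the toy data are hypothetical placeholders, and converting distances to similarities with a Gaussian kernel whose width is the mean distance is just one common choice that I am assuming, so tune it for your data. `pairwise_distances` accepts a callable metric and an `n_jobs` argument, which covers the parallel part of the question.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import pairwise_distances


def custom_distance(a, b):
    # Hypothetical custom metric (Manhattan distance); replace with your own.
    return np.abs(a - b).sum()


# Toy data: two well-separated blobs of 20 points each (illustration only).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.3, size=(20, 2)),
    rng.normal(loc=5.0, scale=0.3, size=(20, 2)),
])

# Pairwise distance matrix using the custom metric; n_jobs=-1 uses all cores.
D = pairwise_distances(X, metric=custom_distance, n_jobs=-1)

# Assumption: turn distances into similarities with a Gaussian kernel.
# The kernel width (mean distance) is a heuristic, not a library default.
sigma = D.mean()
A = np.exp(-(D ** 2) / (2.0 * sigma ** 2))

# affinity="precomputed" tells scikit-learn that A is already a similarity matrix.
model = SpectralClustering(n_clusters=2, affinity="precomputed", random_state=0)
labels = model.fit_predict(A)
```

The matrix passed in must be symmetric with larger values meaning "more similar", which is why the raw distance matrix is not handed to `fit_predict` directly.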
Hope it helps.