I am trying to draw a concave polygon edge around each K-Means cluster, as shown below (fig_1).
With @ypnos's help, this piece of code plots everything except the edge.
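import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.spatial import ConvexHull
from sklearn.metrics import pairwise_distances_argmin

df = pd.read_csv('https://raw.githubusercontent.com/MachineIntellect/dataset.ml/master/watermelon/watermelon_4_0.csv')
X = df.iloc[:,1:].to_numpy()
# Use three samples as the initial centroids.
m0 = X[5]
m1 = X[11]
m2 = X[23]
centroids = np.array([m0, m1, m2])
# Assign every sample to its nearest centroid.
labels = pairwise_distances_argmin(X, centroids)
# Recompute each centroid as the mean of its cluster members.
m0 = X[labels == 0].mean(0)
m1 = X[labels == 1].mean(0)
m2 = X[labels == 2].mean(0)
new_centroids = np.array([m0, m1, m2])
plt.xlim(0.1,0.9)
plt.ylim(0, 0.8)
plt.scatter(X[:,0], X[:,1])
plt.scatter(new_centroids[:,0], new_centroids[:,1], c='r', marker = '+')
# Draw the convex hull of each cluster, one edge (simplex) at a time.
for i in range(3):
    points = X[labels == i]
    hull = ConvexHull(points)
    for simplex in hull.simplices:
        plt.plot(points[simplex, 0], points[simplex, 1], 'r-')
plt.show()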
(fig_2)
The scikit-learn documentation looks like a useful source of inspiration.
The problem is that the edge indicated by the arrow in fig_1 differs from the corresponding edge in fig_2.
In fig_1, the edge of the polygon indicated by the arrow bends inward (thanks to @dwilli).
As @ImportanceOfBeingErnest pointed out, scipy.spatial.ConvexHull cannot produce a concave polygon.
Is there another module or package that can compute a concave hull?
Any hint would be appreciated.
The alphashape package is made for this case.
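import alphashape
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# PolygonPatch is assumed to come from the descartes package.
from descartes import PolygonPatch
from sklearn.metrics import pairwise_distances_argmin

url = 'https://raw.githubusercontent.com/MachineIntellect/dataset.ml/master/watermelon/watermelon_4_0.csv'
df = pd.read_csv(url)
X = df.iloc[:,1:].to_numpy()
# Use three samples as the initial centroids.
m0 = X[5]
m1 = X[11]
m2 = X[23]
centroids = np.array([m0, m1, m2])
# Assign every sample to its nearest centroid.
labels = pairwise_distances_argmin(X, centroids)
# Recompute each centroid as the mean of its cluster members.
m0 = X[labels == 0].mean(0)
m1 = X[labels == 1].mean(0)
m2 = X[labels == 2].mean(0)
new_centroids = np.array([m0, m1, m2])
fig, ax = plt.subplots()
plt.xlim(0.1,0.9)
plt.ylim(0, 0.8)
ax.scatter(X[:,0], X[:,1])
ax.scatter(new_centroids[:,0], new_centroids[:,1], c='r', marker = '+')
# Draw the alpha shape (concave hull) of each cluster as a translucent patch.
for i in range(3):
    points = X[labels == i]
    alpha_shape = alphashape.alphashape(points, 5.0)
    ax.add_patch(PolygonPatch(alpha_shape, alpha=0.2))
plt.show()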
What your inspiration shows is a Voronoi diagram. The coloring shows, for any coordinate in the graph, which cluster it would be assigned to.
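To make the Voronoi idea concrete, here is a minimal sketch that labels every node of a grid with its nearest centroid. The centroid coordinates and the helper name nearest_centroid_labels are made up for illustration; they are not taken from the question's data.

```python
import numpy as np

def nearest_centroid_labels(points, centroids):
    """Return, for each point, the index of its closest centroid."""
    # Pairwise squared distances, shape (n_points, n_centroids).
    diff = points[:, None, :] - centroids[None, :, :]
    return (diff ** 2).sum(axis=2).argmin(axis=1)

# Hypothetical centroids, chosen only for illustration.
centroids = np.array([[0.3, 0.2], [0.5, 0.5], [0.7, 0.3]])

# Evaluate every node of a regular grid covering the plot area.
xs = np.linspace(0.1, 0.9, 200)
ys = np.linspace(0.0, 0.8, 200)
xx, yy = np.meshgrid(xs, ys)
grid = np.column_stack([xx.ravel(), yy.ravel()])

# regions[r, c] is the cluster index that grid node (xx[r, c], yy[r, c]) falls into.
regions = nearest_centroid_labels(grid, centroids).reshape(xx.shape)
```

Passing xx, yy and regions to plt.contourf should give colored cells similar to the scikit-learn figure. Note that these cells partition the whole plane, so they are not outlines of the cluster members.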
The polygons you show in your first figure are a rough approximation of the convex hull of your cluster members. You could use scipy.spatial.ConvexHull or cv2.convexHull() (from OpenCV) to compute it. The documentation of the former also gives an example on how to plot it.
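As a small sketch of the SciPy route, the hull can also be returned as one closed outline instead of being plotted simplex by simplex. The helper name hull_outline and the sample points are made up for illustration.

```python
import numpy as np
from scipy.spatial import ConvexHull

def hull_outline(points):
    """Return the convex hull of 2-D points as a closed (k + 1, 2) outline."""
    hull = ConvexHull(points)
    # For 2-D input, hull.vertices lists the hull points in counterclockwise order.
    ring = points[hull.vertices]
    # Repeat the first vertex so that plotting the array closes the polygon.
    return np.vstack([ring, ring[:1]])

# Made-up points: a unit square with one interior point.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.5]])
outline = hull_outline(pts)
```

The outline can then be drawn with plt.plot(outline[:, 0], outline[:, 1], 'r-'). The interior point is never part of it, which is also why a convex hull cannot bend inward.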
To generate the polygon, you can try the steps below:
1. Generate a polygon around each cluster, treating each cluster as an individual part of the plot.
2. You can create a rough polygon using the convex hull method mentioned by @ypnos, but to get a better result, have a look at the Delaunay triangulation method.
3. You will generate triangular regions between the points based on a set threshold value. The threshold controls how tightly the polygon fits the points.
4. Using this data, you can plot a concave hull using the extreme points. As you don't want the extreme points to be included as the vertices of the polygon, you should add a buffer to go around the points by a set value.
Expected result on some sample data
There's quite a bit of code required to achieve the result. Here is a link to a comprehensive guide to generate the sample plot.
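For a rough idea of the triangulation and threshold steps, here is a minimal sketch, not the code from the linked guide. It assumes the threshold is a maximum circumradius per triangle; the function name concave_boundary_edges, the parameter max_circumradius and the sample points are all hypothetical, and the buffer step is left out.

```python
from collections import Counter

import numpy as np
from scipy.spatial import Delaunay

def concave_boundary_edges(points, max_circumradius):
    """Return the outline edges, as sorted pairs of point indices, of all
    Delaunay triangles whose circumradius is below max_circumradius."""
    points = np.asarray(points, dtype=float)
    edge_count = Counter()
    for ia, ib, ic in Delaunay(points).simplices:
        pa, pb, pc = points[ia], points[ib], points[ic]
        # Side lengths of the triangle.
        a = np.linalg.norm(pb - pc)
        b = np.linalg.norm(pa - pc)
        c = np.linalg.norm(pa - pb)
        # Triangle area from the 2-D cross product.
        area = 0.5 * abs((pb[0] - pa[0]) * (pc[1] - pa[1])
                         - (pb[1] - pa[1]) * (pc[0] - pa[0]))
        if area == 0.0:
            continue  # skip degenerate (flat) triangles
        circumradius = a * b * c / (4.0 * area)
        if circumradius < max_circumradius:
            for i, j in ((ia, ib), (ib, ic), (ic, ia)):
                edge_count[tuple(sorted((int(i), int(j))))] += 1
    # An edge used by exactly one kept triangle lies on the outline.
    return sorted(edge for edge, n in edge_count.items() if n == 1)

# Made-up chevron: three outer points and one notch point at index 3.
pts = np.array([[0.0, 2.0], [2.0, 0.0], [4.0, 2.0], [2.0, 1.0]])
tight = concave_boundary_edges(pts, 2.0)  # drops the large triangle across the notch
loose = concave_boundary_edges(pts, 3.0)  # keeps every triangle
```

With the tight threshold the outline passes through the notch point, so the polygon bends inward; with the loose threshold it falls back to the convex hull. A smaller threshold gives a tighter fit, but it can also split a cluster into several pieces.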
Have you tried to define your edges through an active contour model?
A scikit-image implementation of it is available: https://scikit-image.org/docs/dev/auto_examples/edges/plot_active_contours.html
There is also another version, https://github.com/brikeats/Snakes-in-a-Plane, which works without some of the image preprocessing.