Clustering of images to evaluate diversity (Weka?)

For a university course I have some features of images (as text files). I have to rank those images according to their diversity.

The idea I have in mind is to cluster the images with k-means and then compute the Euclidean distance from each image to the centroid of its cluster. Then I'd rotate over the clusters, always taking the next-closest image to the centroid. I.e., return the image closest to centroid 1, then the closest to centroid 2, then 3, ..., then the second closest to centroid 1, 2, 3, and so on.
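To make the rotation concrete, here is a minimal sketch in plain Python. It assumes the clustering has already been done elsewhere (e.g. in Weka) and that you have, for each image, a feature vector and a cluster label, plus one centroid per cluster. The function name `round_robin_rank` and the toy data are hypothetical, made up for illustration only.

```python
import math
from itertools import zip_longest

def round_robin_rank(features, labels, centroids):
    """Order image indices by rotating over clusters, closest-to-centroid first."""
    # Group image indices by cluster, keeping each image's distance to its centroid.
    per_cluster = {}
    for idx, (vec, label) in enumerate(zip(features, labels)):
        dist = math.dist(vec, centroids[label])  # Euclidean distance (Python 3.8+)
        per_cluster.setdefault(label, []).append((dist, idx))

    # Within each cluster, sort so the image nearest the centroid comes first.
    queues = [sorted(per_cluster[label]) for label in sorted(per_cluster)]

    # Take one image per cluster per round; smaller clusters simply run out early.
    ranking = []
    for picks in zip_longest(*queues):
        ranking.extend(pick[1] for pick in picks if pick is not None)
    return ranking

# Toy data (assumed): five "images" with 2-D features, already split into two clusters.
features = [(0.0, 0.0), (1.0, 0.0), (0.2, 0.0), (10.0, 10.0), (10.0, 11.0)]
labels = [0, 0, 0, 1, 1]
centroids = [(0.4, 0.0), (10.0, 10.5)]

print(round_robin_rank(features, labels, centroids))  # [2, 3, 0, 4, 1]
```

Ties in distance are broken by image index here, which is an arbitrary choice.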

First question: would this be a clever approach? Or am I on the wrong path?

Second question: I'm a bit confused. I thought I'd feed the data to Weka and it'd tell me "hey, if I were you, I'd split this data into 7 clusters", or something like that. I mean, that it'd be able to give me some information about the clusters I need. Instead, to use SimpleKMeans I'm supposed to know a priori how many clusters I'll use... how could I possibly know that?

One example of what I mean: let's say I have 3 mono-color images: light-blue, blue, red. I thought Weka would notice that the 2 blues are similar and cluster them together.

Btw I'm kind of new to Weka (as you might have seen) so if you could provide some information on which functions I might want to use (and why :P) I'd be grateful! Thank you!

SimpleKMeans is an algorithm for which you have to specify the number of clusters in the data set up front.

If you don't know how many clusters there might be, it's better to use a different algorithm or to estimate the number of clusters first.

You can use X-means, where you don't need to specify the k parameter yourself (http://weka.sourceforge.net/doc.packages/XMeans/weka/clusterers/XMeans.html):

X-Means is K-Means extended by an Improve-Structure part. In this part of the algorithm the centers are attempted to be split in its region. The decision between the children of each center and itself is done comparing the BIC-values of the two structures.

Or you can look at a cut-point chart based on AHC, the agglomerative hierarchical clustering algorithm (https://en.wikipedia.org/wiki/Hierarchical_clustering), and then deduce the number of clusters from it.
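As a rough illustration of reading the number of clusters off a hierarchical clustering, here is a small pure-Python sketch. Average linkage and the "largest jump between consecutive merge distances" rule are my own assumptions; other linkages and cut rules are equally valid, and this is not what Weka does internally. The names `merge_heights` and `suggest_k` and the sample points are hypothetical.

```python
import math

def merge_heights(points):
    """Agglomerative clustering (average linkage); returns the distance of each merge."""
    clusters = [[p] for p in points]  # start with every point in its own cluster
    heights = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise Euclidean distance between the two clusters.
                total = sum(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]                          # j > i, so index i stays valid
        heights.append(d)
    return heights

def suggest_k(points):
    """Heuristic (assumed): cut where consecutive merge distances jump the most."""
    heights = merge_heights(points)
    gaps = [heights[m + 1] - heights[m] for m in range(len(heights) - 1)]
    m = max(range(len(gaps)), key=gaps.__getitem__)
    # Cutting right after merge m (0-based) leaves len(points) - (m + 1) clusters.
    return len(points) - (m + 1)

# Toy data (assumed): three visually obvious groups in 2-D.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0), (21, 0)]
print(suggest_k(points))  # 3
```

This is O(n^3) and only meant for small data sets; for real image features you would let the clustering tool build the hierarchy and only inspect the merge distances.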