I have a vector of integers which I wish to divide into clusters so that the distance between any two clusters is greater than a lower bound, and within any cluster, the distance between two elements is less than an upper bound.
For example, suppose we have the following vector:
1, 4, 5, 6, 9, 29, 32, 36
And set the aforementioned lower bound and upper bound to 19 and 9 respectively, the two vectors below should be a possible result:
1, 4, 5, 6, 9
29, 32, 36
Thanks to @flodel 's comments, I realized this kind of clustering may be impossible. So I would like to modify the questions a bit:
What are the possible clustering methods if I impose only the between cluster distance lower bound? What are the possible clustering methods if I impose only the within cluster distance upper bound?
Here's a simple algorithm that will work, explained conceptually (implementation details omitted):
lower_bound
apart. These mark all the possible cluster boundaries.left_marker
andright_marker
, check if the distance between the element immediately to the right of theleft_marker
and the element immediately to the left of theright_marker
is less thanupper_bound
apart.Applying this to your example, we get:
EDIT: Original poster relaxed the conditions of the problem.
If you only want to satisfy the lower bound condition:
lower_bound
apart.The following gets you step 2 assuming your vector is already sorted:
What are the possible clustering methods if I impose only the between cluster distance lower bound?
Hierarchical clustering with single linkage:
What are the possible clustering methods if I impose only the within cluster distance upper bound?
Hierarchical clustering with complete linkage: