Given N items with values x[1], ..., x[n]
and an integer K find a linear time algorithm to arrange these N items in K non empty groups such that in each group the range (difference between minimum and maximum element values/keys in each group) is minimized and therefore the sum of the ranges is minimum.
For example given N=4
, K=2
and the elements 1 1 4 3
the minimum range is 1
for groups (1,1)
and (4,3)
.
Alright, I'll assume that we want to minimize the sum of differences over all groups.
Let's sort the numbers. There's an optimal answer where each group is a consecutive segment in the sorted array (proof: let A1 < B1 < A2 < B2. We can exchange A2 and B1. The answer will not increase).
Let a[l], a[l + 1], ..., a[r] is a group. It's cost is
a[r] - a[l] = (a[r] - a[r - 1]) + (a[r - 1] - a[r - 2]) + ... + (a[l + 1] - a[l])
. It leads us to a key insight:k
groups isk - 1
gaps and the answer isa[n - 1] - a[0] - sum of gaps
. Thus, we just need to maximize the gaps.Here is a final solution:
k - 1
largest differences. That's exactly where the groups split.k-1
th largest element in linear time (or if we are fine withO(N log N)
time, we can just sort them). That's it.Here is an example:
x = [1, 1, 4, 3], k = 2
sorted:
[1, 1, 3, 4]
differences:
[0, 2, 1]
taking
k - 1 = 1
largest gaps: it's 2. Thus the groups are[1, 1]
and[3, 4]
.A slightly more contrived one:
x = [8, 2, 0, 3], k = 3
sorted:
[0, 2, 3, 8]
differences:
[2, 1, 5]
taking
k - 1 = 2
largest gaps: they're 2 and 5. Thus, the groups are[0], [2, 3], [8]
with the total cost of 1.You can binary search the answer.
Assume the optimal answer is x. Now you should verify whether we can group the items into k groups where the maximum difference between the group items is at most x. This can be done in O(n) [after sorting the array]. Traverse the sorted array and pick consecutive items until the difference between minimum number you have picked for this group and the maximum number you have picked hasn't exceeded x. After that you should initialize a new group and repeat this process. At the end count how many groups you have made. If the number of groups is more than k we can conclude that we can not group the items in k groups with x being the answer. So we should increase x. By binary searching on x we can find the minimum x.
The overall complexity is O(NlogN).
Here is a sample implementation in C++