Partition an array into K subarrays with minimal d

Posted 2020-03-04 03:43

Question:

DISCLAIMER:

The problem described looks like a task from a competition. I'm not participating in any, and I'm not aware of any ongoing competition that might involve this problem. If there is one, I'll close the question to stay fair!

I have a problem: given an array A of values and an integer K, split A into exactly K non-overlapping contiguous subarrays in such a way that the difference between the largest and the smallest subarray sums is minimal. It is allowed to rotate A by any number of positions in either direction.

Consider an example:

Input: A = [5 1 1 1 3 2], K = 3

Output: [5][1 1 1][3 2], maximum sum = 5, minimum sum = 3, result = 2
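
For cross-checking small inputs, a brute-force reference can help pin down failing cases. The sketch below is not part of the original post; the names place_cuts and brute_force_min_diff are made up for illustration. It tries every rotation and every placement of the K-1 cuts, so it is exponential in K and only meant for tiny arrays.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical helper: a[start..] still has to be split into parts_left parts;
// lo and hi are the smallest and largest part sums chosen so far.
static void place_cuts(const std::vector<int>& a, int start, int parts_left,
                       int lo, int hi, int& best) {
  const int n = static_cast<int>(a.size());
  if (parts_left == 1) {
    // The last part takes everything that remains.
    int s = 0;
    for (int i = start; i < n; ++i) s += a[i];
    lo = std::min(lo, s);
    hi = std::max(hi, s);
    best = std::min(best, hi - lo);
    return;
  }
  int s = 0;
  // Leave at least one element for each of the remaining parts.
  for (int end = start; end <= n - parts_left; ++end) {
    s += a[end];
    place_cuts(a, end + 1, parts_left - 1, std::min(lo, s), std::max(hi, s),
               best);
  }
}

// Smallest (max part sum - min part sum) over all rotations and all splits
// into exactly k contiguous parts.
int brute_force_min_diff(std::vector<int> a, int k) {
  const int n = static_cast<int>(a.size());
  assert(k >= 1 && k <= n);
  int best = INT_MAX;
  for (int r = 0; r < n; ++r) {
    place_cuts(a, 0, k, INT_MAX, INT_MIN, best);
    std::rotate(a.begin(), a.begin() + 1, a.end());  // shift left by one
  }
  return best;
}
```

On the example above, brute_force_min_diff({5, 1, 1, 1, 3, 2}, 3) returns 2.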

I have partially working code (terribly ugly, my bad, but it is not meant to be production quality):

#include <climits>
#include <cstdio>
#include <cstring>

const int max_n = 50;
const int max_k = 20;

int deps[max_n];

int max (int x, int y) {
  return x > y ? x : y;
}

int min (int x, int y) {
  return x < y ? x : y;
}

int sum (int a[], int start, int end) {
  int res = 0;
  for (int i = start; i <= end; ++i) res += a[i];

  return res;
}

int k_partitioning(int k, int n, int deps[]) {
  int res = INT_MAX;
  // consider all possible rotations/shifts
  for(int offset = 0; offset < n; ++offset) {
    for(int l_min = 0; l_min < n; ++l_min) {
      for(int r_min = l_min; r_min < n; ++r_min) {
        // check minimal sum subarray
        int min_sum = sum (deps, l_min, r_min);

        int dp[k][n];
        for (int s = 0; s < k; ++s) {
          for (int q = 0; q < n; ++q) {
            dp[s][q] = 0;
          }
        }
        // assuming that current sum is a target sum
        dp[0][r_min-l_min] = min_sum;

        for(int p = 1; p < k; ++p) {
          for(int l_max = 0; l_max < n; ++l_max) {
            for(int r_max = 0; r_max < n; ++r_max) {
              int max_sum = sum(deps, l_max, r_max);

              if (max_sum >= min_sum) dp[p][r_max] = max(dp[p-1][l_max], max_sum);
            } // r_maxs
          } // l_maxs
        } // partitions
        // printing dp

        // skip incorrect partitioning, when not all K partitions were used
        if (dp[k-1][n-1] == 0) continue;

        // update difference
        res = min (res, dp[k-1][n-1] - min_sum);
      } // end min sum seg
    } // start min sum seg
    //break;
  } // cuts
  return res;
}

int main(int argc, char* argv[]) {
  int k = 0;
  scanf("%d", &k);

  int n = 0;
  scanf("%d", &n);

  for (int i = 0; i < n; ++i) {
    scanf("%d", &deps[i]);
  }

  printf ("%d\n", k_partitioning(k, n, deps));

  return 0;
}

The idea is simple: assume that the current segment has the minimal sum, enumerate all possible maximal segments, set up dynamic programming to find the smallest achievable maximum sum, and check the difference. Total complexity: O(K*N^4).

My problem is that it fails some tests and I'm stuck troubleshooting it. Could someone help me with it?

Failed test, for example:

N = 4, K = 2, A = [6 13 10 2]

UPDATE

This version should fix some of the previous issues. First, it removes the wasteful loop over "offsets" and instead rotates the array at the end of the l_min loop. Second, I noticed that dp can't be initialized with 0: this is a minimization task, so it should be initialized with some large value (which depends on the problem's constants; max_value here is already outside the value domain). Finally, intervals should no longer overlap, since each sum excludes the left end of its interval. However, it still does not produce the expected results.

#include <climits>
#include <cstdio>
#include <cstring>

const int max_value = 200000;
const int max_n = 50;
const int max_k = 20;

int deps[max_n];

int max (int x, int y) {
  return x > y ? x : y;
}

int min (int x, int y) {
  return x < y ? x : y;
}

int sum (int a[], int start, int end) {
  int res = 0;
  for (int i = start; i <= end; ++i) res += a[i];

  return res;
}

int k_partitioning(int k, int n, int deps[]) {
  int res = max_value;

  for(int l_min = 0; l_min < n; ++l_min) {
    for(int r_min = l_min; r_min < n; ++r_min) {
      int min_sum = sum (deps, l_min+1, r_min);

      int dp[k][n];
      for (int s = 0; s < k; ++s) {
        for (int q = 0; q < n; ++q) {
          dp[s][q] = max_value;
        }
      }
      // assuming that current sum is a target sum
      dp[0][r_min-l_min] = min_sum;

      for(int p = 1; p < k; ++p) {
        for(int l_max = 0; l_max < n; ++l_max) {
          for(int r_max = l_max; r_max < n; ++r_max) {
            int max_sum = sum(deps, l_max+1, r_max);

            if (max_sum >= min_sum) dp[p][r_max] = max(dp[p-1][l_max], max_sum);
          } // r_maxs
        } // l_maxs
      } // partitions

      // skip incorrect partitioning, when not all K partitions were used
      if (dp[k-1][n-1] == max_value) continue;

      // update difference
      res = min (res, dp[k-1][n-1] - min_sum);
    } // end min sum seg

    // rotate an array to consider different starting points
    int tmp[n];
    for (int i = 0; i < n; ++i) {
      int new_idx = i + n + 1;

      tmp[new_idx % n] = deps[i];
    }

    for(int i = 0; i < n; ++i) deps[i] = tmp[i];
  } // start min sum seg

  return res;
}

int main(int argc, char* argv[]) {
  int k = 0;
  scanf("%d", &k);

  int n = 0;
  scanf("%d", &n);

  for (int i = 0; i < n; ++i) {
    scanf("%d", &deps[i]);
  }

  printf ("%d\n", k_partitioning(k, n, deps));

  return 0;
}

Answer 1:

Ok, I think I did it!

The idea is as follows: we assume that the minimum-sum interval always starts at index 0. Then we enumerate maximum-sum intervals, starting from the right boundary of the minimal interval. We build a DP for the current max interval to determine the smallest possible maximal sum. After that we update the result and rotate the array by one.

My code is not perfect in that it recomputes the sums on every iteration. One could precompute them and just look them up each time.

This code might have some bugs, but it passes all the tests that I have.
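
The precomputation mentioned here is usually done with prefix sums. A minimal sketch, assuming std::vector instead of the raw arrays used in the code below; build_prefix and range_sum are illustrative names, not part of the answer's code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// prefix[i] holds a[0] + ... + a[i-1], so prefix[0] == 0.
std::vector<int> build_prefix(const std::vector<int>& a) {
  std::vector<int> prefix(a.size() + 1, 0);
  for (std::size_t i = 0; i < a.size(); ++i) prefix[i + 1] = prefix[i] + a[i];
  return prefix;
}

// Sum of a[start..end], both ends inclusive, in O(1).
// Returns 0 for an empty range (start > end), like the loop-based sum().
int range_sum(const std::vector<int>& prefix, int start, int end) {
  if (start > end) return 0;
  assert(start >= 0 && end + 1 < static_cast<int>(prefix.size()));
  return prefix[end + 1] - prefix[start];
}
```

Since the array is rotated after every outer iteration, the prefix array would have to be rebuilt after each rotation, which costs O(n) per rotation and replaces the O(n) work of every sum() call.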

#include <climits>
#include <cstdio>
#include <cstring>

const int max_value = 200000;
const int max_n = 50;
const int max_k = 20;

int deps[max_n];

int max (int x, int y) {
  return x > y ? x : y;
}

int min (int x, int y) {
  return x < y ? x : y;
}

int sum (int a[], int start, int end) {
  int res = 0;

  for (int i = start; i <= end; ++i) res += a[i];

  return res;
}

int k_partitioning(int k, int n, int deps[]) {
  int res = max_value;
  for(int offset = 0; offset < n; ++offset) {
    int l_min = 0;
    for(int r_min = l_min; r_min < n; ++r_min) {
      int min_sum = sum (deps, l_min, r_min);

      int dp[k][n];
      for (int s = 0; s < k; ++s) {
        for (int q = 0; q < n; ++q) {
          dp[s][q] = max_value;
        }
      }
      // assuming that current sum is a target sum
      dp[0][r_min-l_min] = min_sum;

      for(int p = 1; p < k; ++p) {
        for(int l_max = r_min; l_max < n; ++l_max) {
          for(int r_max = l_max; r_max < n; ++r_max) {
            int max_sum = sum(deps, l_max+1, r_max);

            if (max_sum >= min_sum) {
              dp[p][r_max] = min(dp[p][r_max], max(dp[p-1][l_max], max_sum));
            }

          } // r_maxs
        } // l_maxs
      } // partitions

      // skip incorrect partitioning, when not all K partitions were used
      if (dp[k-1][n-1] == max_value) continue;

      // update difference
      res = min (res, dp[k-1][n-1] - min_sum);
    } // end min sum seg
    // rotate the array by one position to try the next starting point
    int tmp[n];
    for (int i = 0; i < n; ++i) {
      int new_idx = i + n - 1;

      tmp[new_idx % n] = deps[i];
    }

    for(int i = 0; i < n; ++i) deps[i] = tmp[i];

  } // rotations
  return res;
}

int main(int argc, char* argv[]) {
  int k = 0;
  scanf("%d", &k);

  int n = 0;
  scanf("%d", &n);

  for (int i = 0; i < n; ++i) {
    scanf("%d", &deps[i]);
  }

  printf ("%d\n", k_partitioning(k, n, deps));

  return 0;
}


Answer 2:

Solution without rotations:

  • 1) Compute max M and total S of the array - O(n)
  • 2) Let F(P) be a function that returns True if it is possible to keep every partition's sum at P or less with k (>= 0) partitions still remaining.
  • 3) Do a binary search on range(M, S) using F. - O(log(S-M))
  • 4) Logic behind F: Fill a bucket while its sum stays at or below the limit P, then move on to the next bucket. If items remain but no buckets are left, the answer is False - O(n)

    Time Complexity = O(n) + O(n) * (log(S-M)) = O(n*log(S-M))
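
The steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: the values are non-negative, the names fits and min_largest_sum are made up here, and the search returns the smallest achievable largest partition sum (the quantity F tests), not the max-min difference itself.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// F(P): can the array be cut into at most k contiguous parts so that no part
// sums to more than limit? Greedy: keep filling the current bucket while the
// next element still fits, otherwise open a new bucket.
bool fits(const std::vector<int>& a, int k, long long limit) {
  int parts = 1;
  long long current = 0;
  for (int x : a) {
    if (x > limit) return false;  // a single element already overflows
    if (current + x > limit) {    // bucket is full, open the next one
      ++parts;
      current = 0;
    }
    current += x;
  }
  return parts <= k;
}

// Binary search over [M, S] for the smallest limit that fits() accepts.
long long min_largest_sum(const std::vector<int>& a, int k) {
  assert(!a.empty() && k >= 1);
  long long lo = *std::max_element(a.begin(), a.end());     // M
  long long hi = std::accumulate(a.begin(), a.end(), 0LL);  // S
  while (lo < hi) {
    long long mid = lo + (hi - lo) / 2;
    if (fits(a, k, mid)) {
      hi = mid;  // mid is feasible, try something smaller
    } else {
      lo = mid + 1;
    }
  }
  return lo;
}
```

For A = [5 1 1 1 3 2] and K = 3 this returns 5, which matches the maximum sum in the question's example.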

Solution with Rotations:

For each rotation in [0, 1, ..., N-1], run the procedure above and keep the best result.

Total Time Complexity = O(n) * O(nlog(S-M)) = O(n^2*log(S-M))



Answer 3:

Now that you've got your code working, here's an alternative method :)

Consider that for each k, we can pair a sum growing leftward from A[i] (the sum of A[i-j..i]) with every interval recorded for f(k-1, i-j-1) and update it. For each interval (low, high): if the sum is greater than high, then new_interval = (low, sum); if the sum is lower than low, then new_interval = (sum, high); otherwise, the interval stays the same. For example,

i:  0 1 2 3 4 5
A: [5 1 1 1 3 2]

k = 3
i = 3, j = 0
The ordered intervals available for f(3-1, 3-0-1) = f(2,2) are:
  (2,5), (1,6) // These were the sums, (A[1..2], A[0]) and (A[2], A[0..1])
Sum = A[3..3-0] = 1
Update intervals: (2,5) -> (1,5)
                  (1,6) -> (1,6) no change
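
The update rule amounts to widening the interval just enough to cover the new sum. A tiny sketch, where Interval and extend are names chosen here for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

using Interval = std::pair<int, int>;  // (low, high)

// Returns the interval after adding one more part with the given sum:
// low can only go down, high can only go up, otherwise nothing changes.
Interval extend(Interval iv, int sum) {
  assert(iv.first <= iv.second);
  return Interval(std::min(iv.first, sum), std::max(iv.second, sum));
}
```

With the numbers above, extend(Interval(2, 5), 1) gives (1,5) and extend(Interval(1, 6), 1) leaves (1,6) unchanged.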

Now, we can make this iteration much more efficient by recognizing and pruning intervals during the previous k round.

Watch:

A: [5 1 1 1 3 2]

K = 1:

  N = 0..5; Intervals: (5,5), (6,6), (7,7), (8,8), (11,11), (13,13)

K = 2:

  N = 0: Intervals: N/A

  N = 1: Intervals: (1,5)

  N = 2: (1,6), (2,5)

    Prune: remove (1,6) since any sum <= 1 would be better paired with (2,5)
           and any sum >= 6 would be better paired with (2,5)

  N = 3: (1,7), (2,6), (3,5)

    Prune: remove (2,6) and (1,7)

  N = 4: (3,8), (4,7), (5,6), (5,6)

    Prune: remove (3,8) and (4,7)

  N = 5: (2,11), (5,8), (6,7)

    Prune: remove (2,11) and (5,8)

For k = 2, we are now left with the following pruned record:

{
  k: 2,
  n: {
    1: (1,5),
    2: (2,5),
    3: (3,5),
    4: (5,6),
    5: (6,7)
  }
}

We've cut down the iteration of k = 3 from a list of n choose 2 possible splits to n relevant splits!

The general algorithm applied to k = 3:

for k' = 1 to k
  for sum A[i-j..i], for i <- [k'-1..n], j <- [0..i-k'+1]:
    for interval in record[k'-1][i-j-1]: // records are for [k'][n']
      update interval
  prune intervals in k'

k' = 3
  i = 2
    sum = 1, record[2][1] = (1,5) -> no change

  i = 3
    // sums are accumulating right to left starting from A[i]
    sum = 1, record[2][2] = (2,5) -> (1,5)
    sum = 2, record[2][1] = (1,5) -> no change

  i = 4
    sum = 3, record[2][3] = (3,5) -> no change
    sum = 4, record[2][2] = (2,5) -> no change
    sum = 5, record[2][1] = (1,5) -> no change

  i = 5
    sum = 2, record[2][4] = (5,6) -> (2,6)
    sum = 5, record[2][3] = (3,5) -> no change
    sum = 6, record[2][2] = (2,5) -> (2,6)
    sum = 7, record[2][1] = (1,5) -> (1,7)

The answer is 5 paired with record[2][3] = (3,5), yielding the updated interval, (3,5). I'll leave the pruning logic for the reader to work out. If we wanted to continue, here's the pruned list for k = 3

{
  k: 3
  n: {
    2: (1,5), 
    3: (1,5),
    4: (3,5),
    5: (3,5)
  }
}
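
For readers who would rather see the whole thing in code, here is one possible sketch of the scheme described in this answer, without rotations. It rests on one assumption about the pruning rule: an interval is dropped whenever it fully contains another interval recorded for the same position, since the contained one can never end up wider after the same update. That agrees with every prune step shown above, but it is my reading, not necessarily the exact logic the answer had in mind. The names prune and min_diff_no_rotation are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <utility>
#include <vector>

using Interval = std::pair<int, int>;  // (lowest part sum, highest part sum)

// Drops every interval that contains another one; duplicates collapse to one.
static std::vector<Interval> prune(std::vector<Interval> v) {
  // Highest low first; for equal lows, the tighter interval first.
  std::sort(v.begin(), v.end(), [](const Interval& x, const Interval& y) {
    if (x.first != y.first) return x.first > y.first;
    return x.second < y.second;
  });
  std::vector<Interval> kept;
  int best_high = INT_MAX;  // smallest high seen so far
  for (const Interval& iv : v) {
    // Everything seen so far starts at or after iv.first, so iv contains one
    // of those intervals exactly when iv.second >= best_high.
    if (iv.second < best_high) {
      kept.push_back(iv);
      best_high = iv.second;
    }
  }
  return kept;
}

// Smallest (max - min) part sum over splits of a into exactly k contiguous
// parts, with no rotation applied.
int min_diff_no_rotation(const std::vector<int>& a, int k) {
  const int n = static_cast<int>(a.size());
  assert(k >= 1 && k <= n);
  // prev[i]: pruned intervals for a[0..i] split into the current part count.
  std::vector<std::vector<Interval>> prev(n);
  int running = 0;
  for (int i = 0; i < n; ++i) {
    running += a[i];
    prev[i] = {Interval(running, running)};  // one part: low == high
  }
  for (int parts = 2; parts <= k; ++parts) {
    std::vector<std::vector<Interval>> cur(n);
    for (int i = parts - 1; i < n; ++i) {
      std::vector<Interval> candidates;
      int sum = 0;
      // The last part is a[start..i]; the sum grows right to left from a[i].
      for (int start = i; start >= parts - 1; --start) {
        sum += a[start];
        for (const Interval& iv : prev[start - 1]) {
          candidates.emplace_back(std::min(iv.first, sum),
                                  std::max(iv.second, sum));
        }
      }
      cur[i] = prune(std::move(candidates));
    }
    prev = std::move(cur);
  }
  int best = INT_MAX;
  for (const Interval& iv : prev[n - 1]) {
    best = std::min(best, iv.second - iv.first);
  }
  return best;
}
```

For A = [5 1 1 1 3 2] and k = 3 this returns 2, the width of the (3,5) interval above. To handle rotations, one could call it once per rotation of the array and keep the minimum.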