Suppose I have two arrays and I want to merge them so that the merged array has the minimum amount of repetitions. For example [ 'x', 'x' ]
is a repetition.
arr1 = [ 'x', 'd', 'd', 'm', 'f', 'm' ]
arr2 = [ 'd', 'd', 'x', 'f', 'f', 'm' ]
The only condition is that in the merged array, the elements from arr1
and arr2
must appear in their respective orders within arr1
and arr2
. Below is an example of the merged array with 0 repetitions while maintaining this condition.
merged = [ 'd', 'x', 'd', 'x', 'd', 'f', 'd', 'm', 'f', 'm', 'f', 'm' ]
I'm trying to relate this problem to popular dynamic programming problems to help me out. Are there any similar problems out there that I should look into?
I define the following function: F(d, i, j) = the minimum number of repetitions possible from a string made of a prefix of arr1
of length i and prefix of arr2
of length j, followed by a ith (d=0) or jth (d=1) symbol from arr[d]
. Thus F(d, i, j) corresponds to a string of length i+j+1.
If you are familiar with the way Levenshtein distance is calculated, think of it as instead of assigning scores to the vertices of the grid we assign scores to the edges, where d
signifies whether it's the horizontal or the vertical edge. This gives us a one-symbol 'memory', so we can detect repetitions.
The following C++ code calculates the minimum number of repetitions and prints a corresponding string in quadratic time:
#include <string.h>
#include <stdio.h>
#include <algorithm>
#include <limits.h>
char A[32], B[32], C[64];
int score[2][32][32];
void print_result(int d, int i, int j)
{
char c = d ? B[j] : A[i];
int s0 = i > 0 ? score[0][i-1][j] + (A[i-1] == c) : INT_MAX;
int s1 = j > 0 ? score[1][i][j-1] + (B[j-1] == c) : INT_MAX;
if(s0 <= s1 && i > 0)
print_result(0, i-1, j);
else if(j > 0)
print_result(1, i, j-1);
printf("%c", c);
}
void print_result(int i, int j)
{
if(score[0][i-1][j] < score[1][i][j-1])
print_result(0, i-1, j);
else
print_result(1, i, j-1);
}
int main()
{
fgets(A, sizeof(A), stdin);
fgets(B, sizeof(B), stdin);
int m = strlen(A) - 1; // -1 to remove LF
int n = strlen(B) - 1;
for(int j = 0; j <= n; ++j)
{
for(int i = 0; i <= m; ++i)
{
score[0][i][j] = !i && !j ? 0 : std::min(
i > 0 ? score[0][i-1][j] + (A[i-1] == A[i]) : INT_MAX,
j > 0 ? score[1][i][j-1] + (B[j-1] == A[i]) : INT_MAX
);
score[1][i][j] = !i && !j ? 0 : std::min(
i > 0 ? score[0][i-1][j] + (A[i-1] == B[j]) : INT_MAX,
j > 0 ? score[1][i][j-1] + (B[j-1] == B[j]) : INT_MAX
);
}
}
printf("repetitions: %d\n", std::min(score[0][m-1][n], score[1][m][n-1]));
print_result(m, n);
printf("\n");
return 0;
}