Longest common prefix length of all substrings and

2020-03-30 06:18发布

问题:

I found similar questions on StackOverflow, but my question is different.

Given a string s contains lowercase alphabet. I want to find the length of Longest common Prefix of all substrings.

For example

s = 'ababac'

Then substrings are as follow:

1: s(1, 6) = ababac
2: s(2, 6) = babac
3: s(3, 6) = abac
4: s(4, 6) = bac
5: s(5, 6) = ac
6: s(6, 6) = c

Now, The lengths of LCP of all substrings are as follow

1: len(LCP(s(1, 6), s)) = 6 
2: len(LCP(s(2, 6), s)) = 0
3: len(LCP(s(3, 6), s)) = 3
4: len(LCP(s(4, 6), s)) = 0
5: len(LCP(s(5, 6), s)) = 1
6: len(LCP(s(6, 6), s)) = 0

I am using character by character matching

    string commonPrefix(string s1, string s2) { 
        int minlen = minlength1(s1, s2); 
        char current; 
        int result = 0;
        for (int i=0; i<minlen; i++) { 
            current = s1[i]; 
            for (int j=1 ; j<n; j++) 
                if (s2[i] != current) 
                return result; 
            result++;
        } 

        return result; 
    }

But still, it's O(n2). I know all substrings are overlapping on one another, It can be optimized further. Can anyone help to optimize this code?

回答1:

This is similar to Z-algorithm for pattern matching. Except for the first case where len(LCP(s(1, 6), s)) = len (s).

We need to create a Z array . For a string str[0..n-1], Z array is of same length as string. An element Z[i] of Z array stores length of the longest substring starting from str[i] which is also a prefix of str[0..n-1]. The first entry of Z array is meaning less as complete string is always prefix of itself.

Visualize the algorithm here : https://personal.utdallas.edu/~besp/demo/John2010/z-algorithm.htm

Below is the solution of the same :

public static int[] computeZ(String s) {
    int l = 0; r = 0;
    int [] Z = new int[len];
    int len = s.length();
    for (int k =0 ; k < len; k++ ) {
        int j;
        if (k < r) {
            j = (z[k-l] < (r-k)) ? z[k-l] : (r-k)
        } else {
            j = 0;
        }
        while (k + j < len) {
            if (s.charAt(k+j) == s.charAt(j)) {
                j++;
            } else {
                break;
            }
        }
        if (k + j > r) {
            l = k;
            r = k + j;
        }
    }
    Z[0] = len;
    return Z;
}


回答2:

As mentioned by Aditya, this can be solved using Z-Algorithm. Please find the detailed explanation with implementation here - https://www.hackerearth.com/practice/algorithms/string-algorithm/z-algorithm/tutorial/