I found similar questions on StackOverflow, but my question is different.
Given a string s
contains lowercase alphabet
. I want to find the length of Longest common Prefix
of all substrings.
For example
s = 'ababac'
Then substrings are as follow:
1: s(1, 6) = ababac
2: s(2, 6) = babac
3: s(3, 6) = abac
4: s(4, 6) = bac
5: s(5, 6) = ac
6: s(6, 6) = c
Now, The lengths of LCP
of all substrings are as follow
1: len(LCP(s(1, 6), s)) = 6
2: len(LCP(s(2, 6), s)) = 0
3: len(LCP(s(3, 6), s)) = 3
4: len(LCP(s(4, 6), s)) = 0
5: len(LCP(s(5, 6), s)) = 1
6: len(LCP(s(6, 6), s)) = 0
I am using character by character matching
string commonPrefix(string s1, string s2) {
int minlen = minlength1(s1, s2);
char current;
int result = 0;
for (int i=0; i<minlen; i++) {
current = s1[i];
for (int j=1 ; j<n; j++)
if (s2[i] != current)
return result;
result++;
}
return result;
}
But still, it's O(n2). I know all substrings are overlapping on one another, It can be optimized further. Can anyone help to optimize this code?
This is similar to Z-algorithm for pattern matching. Except for the first case where len(LCP(s(1, 6), s)) = len (s).
We need to create a Z array . For a string str[0..n-1], Z array is of same length as string. An element Z[i] of Z array stores length of the longest substring starting from str[i] which is also a prefix of str[0..n-1]. The first entry of Z array is meaning less as complete string is always prefix of itself.
Visualize the algorithm here : https://personal.utdallas.edu/~besp/demo/John2010/z-algorithm.htm
Below is the solution of the same :
As mentioned by Aditya, this can be solved using Z-Algorithm. Please find the detailed explanation with implementation here - https://www.hackerearth.com/practice/algorithms/string-algorithm/z-algorithm/tutorial/