Minimum Lexicographic Rotation Using Suffix Array

2020-03-04 02:40发布

问题:

    Consider a string of length n (1 <= n <= 100000). 
    Determine its minimum lexicographic rotation. 
    For example, the rotations of the string “alabala” are:

    alabala

    labalaa

    abalaal

    balaala

    alaalab

    laalaba

    aalabal

    and the smallest among them is “aalabal”.

This is the problem from ACM ICPC 2003 .This problem has already been asked in stack flow by some other user.[But that wasn't useful as , I want to do it by suffix Array.]

How to do this problem using the Suffix Array?

Till Now what I had done??

(1) Lets say the given string is S.

I concatenated the string S with itself to get a string S'.

ie. S'=S+S.

(2).Then I found the suffix array of S' in O(nlog n )time.

For example:
    S=alabala
    S'=alabalaalabala

Suffix No. Index    Suffixes

0       13      a
1       6       aalabala
2       9       abala
3       2       abalaalabala
4       11      ala
5       4       alaalabala
6       7       alabala
7       0       alabalaalabala
8       10      bala
9       3       balaalabala
10      12      la
11      5       laalabala
12      8       labala
13      1       labalaalabala

So I calculated the suffix-array SA well ,SA[]={13,6,9,2,11,4,7,0,10,3,12,5,8,1}.

Also I calculated the LCPs b/w every suffixes [Although i am not confident that I will require it in this problem].

Now How to proceed further.How to use SA to get a desired result?

Explanation with a very *small example will be quite effective.

Thanks!!

回答1:

It seems that you should take first suffix in SA, which index is between 0 and length(S) - 1.

Some explanation: all rotations of S are in the beginning of S' suffixes from positions between 0 and length(S) - 1. Suffix array keeps suffixes in lexicographical order, so you just need to pick the first one which begins from rotation of S.



回答2:

If you are using O(n log n) algorithm (sort by first letter, then by first two letters, then by first four, ...), you can make a little bit modified suffix array.

Don't sort suffixes of string but it's cyclic rotations. It should be really small modification in the algorithm. A then you will get desired result directly.

If you still want to use your method, then just take first index which is between 0 and N.



回答3:

Thanks all .Both the answer by vkorchagin and usamec is correct for most Test Cases ,but they won't work for the Following Test Case(S="baabaa")

S=baabaa; S'=baabaabaabaa;

Suffix| Suffix  |  Suffixes
Index | Length  |

11      1       a
10      2       aa
7       5       aabaa
4       8       aabaabaa
1       11      aabaabaabaa
8       4       abaa
5       7       abaabaa
2       10      abaabaabaa
9       3       baa
6       6       baabaa
3       9       baabaabaa
0       12      baabaabaabaa

Taking the First Suffix whose Index is between 0 to S.length()-1 does not work for the above test case.If I does so ,then the result is 4 ,but the correct answer is 1.

So I modified the answer ,a bit .

This is what i did or added/modified an extra condition to the above answers ::

(1)I took the First Suffix whose Index is between 0 to S.length()-1 .

Lets say its index is :=ExpectedIdx.

In the above Example ExpectedIdx=4.

(2).Now the ExpectedIdx may or may not be the answer. The reason is the Next suffix in Suffix Array may produce the same answer.

Example ::

Taking the suffix whose starting Index is 4 (ExpectedIdx),aabaabaa.,we get aabaab as Minimum Lexograhic rotated String.

Taking the Next suffix , aabaabaabaa.

We also get aabaab as Minimum Lexograhic rotated String.

But the former one requires the shift of 4 while the latter one requires the shift of 1 .So correct answer is 1 ,not 4.

So i used the concept of Longest Common Prefix (LCP) to check the similarities and Finally got accepted.http://livearchive.onlinejudge.org/index.php?option=com_onlinejudge&Itemid=8&page=show_problem&problem=756

Edit:: This is the Pseudocode -

int ExpectedIdx,ExpectedSuffixNumber,ExpectedSuffixLength;
for(int i=0;i<strlen(str);++i)//str = Length of S'
{
    suffixsize=strlen(str)-SA[i];
    if(suffixsize>(Len/2))//Len/2:=Size of S
    {
        ExpectedIdx=SA[i];
        ExpectedSuffixNumber=i;
        ExpectedSuffixLength=suffixsize;
        break;
    }
}
//Now this ExpectediDx may or may not be the correct answer.

int finalans=ExpectedIdx;//Lets assume initially that ExpectedIdx is a correct/final answer.
for(int i=(ExpectedSuffixNumber+1);i<Len;++i)//Check the Next Suffix 
{
    if(LCP[i]>Len/2)//LCP[i]=Lingest common prefix of adjacent prefixes in a suffix Array.
    {
        if(SA[i]>finalans)
        {
            finalans=SA[i];
        }
    }
    else
        break;
}