Time complexity analysis of function with recursio

2019-09-07 00:18发布

I am trying to analysis time complexity of below function. This function is used to check if a string is made of other strings.

set<string> s; // s has been initialized and stores all the strings
bool fun(string word) {
    int len = word.size();

    // something else that can also return true or false with O(1) complexity

    for (int i=1; i<=len; ++i) {
       string prefix = word.substr(0,i);
       string suffix = word.substr(i);
       if (prefix in s && fun(suffix))
           return true;
       else
           return false;
    }
}

I think the time complexity is O(n) where n is the length of word (am I right?). But as the recursion is inside the loop, I don't know how to prove it.

Edit:

This code is not a correct C++ code (e.g., prefix in s). I just show the idea of this function, and want to know how to analysis its time complexity

2条回答
Anthone
2楼-- · 2019-09-07 00:45

The way to analyze this is by developing a recursion relationship based on the length of the input and the (unknown) probability that a prefix is in s. Let's assume that the probability of a prefix being in s is given by some function pr(L) of the length L of the prefix. Let the complexity (number of operations) be given by T(len).

If len == 0 (word is the empty string), then T = 1. (The function is missing a final return statement after the loop, but we're assuming that the actual code is only a sketch of the idea, not what's actually executing).

For each loop iteration, denote the loop body complexity by T(len; i). If the prefix is not in s, then the body has constant complexity (T(len; i) = 1). This event has probability 1 - pr(i).

If the prefix is in s, then the function returns true or false according to the recursive call to fun(suffix), which has complexity T(len - i). This event has probability pr(i).

So for each value of i, the loop body complexity is:

T(len; i) = 1 * (1 - pr(i)) + T(len - i) * pr(i)

Finally (and this depends on the intended logic, not the posted code), we have

T(len) = sum i=1...len(T(len; i))

For simplicity, let's treat pr(i) as a constant function with value 0.5. Then the recursive relationship for T(len) is (up to a constant factor, which is unimportant for O() calculations):

T(len) = sum i=1...len(1 + T(len - i)) = len + sum i=0...len-1(T(i))

As noted above, the boundary condition is T(0) = 1. This can be solved by standard recursive function methods. Let's look at the first few terms:

len   T(len)
0     1
1     1 + 1 = 2
2     2 + 2 + 1 = 5
3     3 + (4 + 2 + 1) = 11
4     4 + (11 + 5 + 2 + 1) = 23
5     5 + (23 + 11 + 5 + 2 + 1) = 47

The pattern is clearly T(len) = 2 * T(len - 1) + 1. This corresponds to exponential complexity:

T(n) = O(2n)

Of course, this result depends on the assumption we made about pr(i). (For instance, if pr(i) = 0 for all i, then T(n) = O(1). There would also be non-exponential growth if pr(i) had a maximum prefix length—pr(i) = 0 for all i > M for some M.) The assumption that pr(i) is independent of i is probably unrealistic, but this really depends on how s is populated.

查看更多
Juvenile、少年°
3楼-- · 2019-09-07 01:08

Assuming that you've fixed the bugs others have noted, then the i values are the places that the string is being split (each i is the leftmost splitpoint, and then you recurse on everything to the right of i). This means that if you were to unwind the recursion, you are looking at up to n-1 different split points, and asking if each substring is a valid word. Things are ok if the beginning of word doesn't have a lot of elements from your set, since then you can skip the recursion. But in the worst case, prefix in s is always true, and you try every possible subset of the n-1 split points. This gives 2^{n-1} different splitting sets, multiplied by the length of each such set.

查看更多
登录 后发表回答