I am trying to analysis time complexity of below function. This function is used to check if a string is made of other strings.
set<string> s; // s has been initialized and stores all the strings
bool fun(string word) {
int len = word.size();
// something else that can also return true or false with O(1) complexity
for (int i=1; i<=len; ++i) {
string prefix = word.substr(0,i);
string suffix = word.substr(i);
if (prefix in s && fun(suffix))
return true;
else
return false;
}
}
I think the time complexity is O(n)
where n is the length of word (am I right?). But as the recursion is inside the loop, I don't know how to prove it.
Edit:
This code is not a correct C++
code (e.g., prefix in s
). I just show the idea of this function, and want to know how to analysis its time complexity
The way to analyze this is by developing a recursion relationship based on the length of the input and the (unknown) probability that a prefix is in
s
. Let's assume that the probability of a prefix being ins
is given by some function pr(L) of the length L of the prefix. Let the complexity (number of operations) be given by T(len).If len == 0 (
word
is the empty string), then T = 1. (The function is missing a finalreturn
statement after the loop, but we're assuming that the actual code is only a sketch of the idea, not what's actually executing).For each loop iteration, denote the loop body complexity by T(len; i). If the prefix is not in
s
, then the body has constant complexity (T(len; i) = 1). This event has probability 1 - pr(i).If the prefix is in
s
, then the function returnstrue
orfalse
according to the recursive call tofun(suffix)
, which has complexity T(len - i). This event has probability pr(i).So for each value of
i
, the loop body complexity is:Finally (and this depends on the intended logic, not the posted code), we have
For simplicity, let's treat pr(i) as a constant function with value 0.5. Then the recursive relationship for T(len) is (up to a constant factor, which is unimportant for O() calculations):
As noted above, the boundary condition is T(0) = 1. This can be solved by standard recursive function methods. Let's look at the first few terms:
The pattern is clearly T(len) = 2 * T(len - 1) + 1. This corresponds to exponential complexity:
Of course, this result depends on the assumption we made about pr(i). (For instance, if pr(i) = 0 for all i, then T(n) = O(1). There would also be non-exponential growth if pr(i) had a maximum prefix length—pr(i) = 0 for all i > M for some M.) The assumption that pr(i) is independent of i is probably unrealistic, but this really depends on how
s
is populated.Assuming that you've fixed the bugs others have noted, then the
i
values are the places that the string is being split (eachi
is the leftmost splitpoint, and then you recurse on everything to the right ofi
). This means that if you were to unwind the recursion, you are looking at up ton-1
different split points, and asking if each substring is a valid word. Things are ok if the beginning ofword
doesn't have a lot of elements from your set, since then you can skip the recursion. But in the worst case,prefix in s
is always true, and you try every possible subset of then-1
split points. This gives2^{n-1}
different splitting sets, multiplied by the length of each such set.