I am currently learning about Big O notation, running times, and amortized times. I understand the notion of O(n) linear time, meaning that the running time grows in proportion to the size of the input, and the same goes for, say, quadratic time O(n^2), and even for algorithms such as permutation generators that run in O(n!) time and grow factorially.
For example, the following function is O(n) because the algorithm grows in proportion to its input n:
    #include <stdio.h>

    /* Prints 0..n-1: the loop body runs exactly n times. */
    void f(int n) {
        int i;
        for (i = 0; i < n; ++i)
            printf("%d", i);
    }
Similarly, if there were a nested loop over the same range, the time would be O(n^2).
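To make the nested-loop case concrete, here is a minimal sketch that counts how often the innermost statement runs. The function name `count_nested_steps` and the cap on n are made up for illustration.

```c
#include <assert.h>

/* Counts how many times the innermost statement of a doubly nested
 * loop executes. Hypothetical helper, for illustration only. */
unsigned long count_nested_steps(unsigned int n) {
    unsigned long steps = 0;
    unsigned int i, j;

    /* Assumed cap: keeps n * n inside a 32-bit unsigned long. */
    assert(n <= 65535u);

    for (i = 0; i < n; ++i)        /* outer loop: n passes */
        for (j = 0; j < n; ++j)    /* inner loop: n passes per outer pass */
            ++steps;               /* runs n * n times in total */
    return steps;
}
```

Doubling n quadruples the count, which is the signature of quadratic growth.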
But what exactly is O(log n)? For example, what does it mean to say that the height of a complete binary tree is O(log n)?
I do know (maybe not in great detail) what a logarithm is, in the sense that log10(100) = 2, but I cannot understand how to identify a function with logarithmic running time.
log_b(x) = y
is the inverse of b^y = x
If you have an M-ary tree of depth d and size n, then:
Traversing the whole tree ~ O(M^d) = O(n)
Walking a single path in the tree ~ O(d) = O(log_M n)
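The relation between size and depth can be checked with a small sketch. It assumes a tree that fills each level completely before starting the next; the name `tree_depth` is hypothetical, and n is assumed small enough that the running total does not overflow.

```c
#include <assert.h>

/* Depth (edges on a root-to-leaf path) of the shallowest m-ary tree
 * that can hold n nodes when every level is filled before the next
 * one is started. Hypothetical helper, for illustration only. */
unsigned int tree_depth(unsigned long long n, unsigned int m) {
    unsigned long long level_size = 1;  /* nodes on the current level */
    unsigned long long capacity = 1;    /* nodes on all levels so far */
    unsigned int depth = 0;

    assert(n >= 1 && m >= 2);

    while (capacity < n) {
        level_size *= m;         /* each level is m times wider */
        capacity += level_size;
        ++depth;                 /* one more step on any single path */
    }
    return depth;
}
```

For a binary tree (m = 2), 15 nodes fit in depth 3 and a million nodes fit in depth 19, so the path length grows far more slowly than the node count.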
The logarithm
Ok let's try and fully understand what a logarithm actually is.
Imagine we have a rope and we have tied it to a horse. If the rope is tied directly to the horse, the force the horse needs to pull away (say, from a man holding the other end) is multiplied by 1, that is, unchanged.
Now imagine the rope is looped round a pole. To get away, the horse will now have to pull many times harder. How many times depends on the roughness of the rope and the size of the pole, but let's assume one complete turn multiplies the man's strength by 10.
Now if the rope is looped once, the horse will need to pull 10 times harder. If the man decides to make it really difficult for the horse, he may loop the rope round the pole again, increasing its strength by an additional 10 times. A third loop will again increase the strength by a further 10 times.
We can see that each loop multiplies the value by 10. The number of turns required to reach a given number is called the logarithm of that number, i.e. we need 3 turns to multiply your strength by 1,000 and 6 turns to multiply it by 1,000,000.
3 is the logarithm of 1,000, and 6 is the logarithm of 1,000,000 (base 10).
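The rope picture translates directly into a loop: keep multiplying until the target is reached and count the turns. This is only a sketch; the name `turns_needed` and its parameters are made up for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Number of turns needed before the multiplied strength reaches
 * `target`, when every turn multiplies the strength by `factor`.
 * For exact powers of `factor` this is the logarithm of `target`.
 * Hypothetical helper, for illustration only. */
unsigned int turns_needed(unsigned long long target, unsigned int factor) {
    unsigned long long strength = 1;
    unsigned int turns = 0;

    assert(target >= 1 && factor >= 2);
    assert(target <= ULLONG_MAX / factor);  /* keeps strength from overflowing */

    while (strength < target) {
        strength *= factor;   /* one more loop round the pole */
        ++turns;
    }
    return turns;
}
```

With a factor of 10, a target of 1,000 needs 3 turns and a target of 1,000,000 needs 6, matching the numbers above.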
So what does O(log n) actually mean?
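In our example above, our 'growth rate' is O(log n). For every additional loop, the force our rope can handle is 10 times more, so the number of loops needed grows only as the logarithm of the force.
Now the example above did use base 10, but fortunately the base of the log is insignificant when we talk about Big O notation: changing the base only multiplies the result by a constant (log_a n = log_b n / log_b a), and Big O ignores constant factors.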
Now let's imagine you are trying to guess a number between 1-100.
Now it took you 7 guesses to get this right. But what is the relationship here? What is the largest number of items you can distinguish with each additional guess?
Using the graph, we can see that if we use a binary search to guess a number between 1 and 100, it will take us at most 7 attempts. If we had 128 numbers, we could also guess the number in 7 attempts, but 129 numbers would take us at most 8 attempts. In terms of logarithms: we need 7 guesses for a range of 128 values and 10 guesses for a range of 1024 values, because 7 is the logarithm of 128 and 10 is the logarithm of 1024 (base 2).
Notice the words 'at most'. The bound here describes the worst case, which is what Big O figures usually quote. If you're lucky, you could guess the number in one attempt, so the best case is O(1), but that's another story.
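The guess counts can be reproduced with a short sketch. It assumes each guess is a yes/no question that halves the remaining range, and that in the worst case the larger half is the one that remains; the name `halving_questions` is hypothetical.

```c
#include <assert.h>

/* Worst-case number of halving questions needed to narrow a range of
 * n candidates down to a single one. Hypothetical helper, for
 * illustration only. */
unsigned int halving_questions(unsigned int n) {
    unsigned int questions = 0;

    assert(n >= 1);

    while (n > 1) {
        n = n - n / 2;   /* worst case: the larger half is left over */
        ++questions;
    }
    return questions;
}
```

Under that counting, 100 and 128 candidates both need 7 questions, 129 need 8, and 1024 need 10.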
What about O(n log n)?
You will eventually come across a linearithmic time O(n log n) algorithm. The rule of thumb above applies again, but this time the logarithmic function has to run n times, e.g. reducing the size of a list n times, which occurs in algorithms like mergesort.
As a rough heuristic, you can often identify whether the running time is n log n. Look for an outer loop which iterates through a list (O(n)). Then look to see if there is an inner loop. If the inner loop cuts the data set down by a constant fraction on each iteration, that loop is O(log n), and so the overall algorithm is O(n log n).
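The loop shape described above can be sketched as follows. This is not mergesort, only a counter for an outer pass over n items with an inner loop that halves its range; the name `count_linearithmic_steps` and the cap on n are assumptions made for illustration.

```c
#include <assert.h>

/* Counts the inner-loop steps of an outer loop over n items whose
 * inner loop halves its range on every pass. Hypothetical helper,
 * for illustration only. */
unsigned long count_linearithmic_steps(unsigned int n) {
    unsigned long steps = 0;
    unsigned int i, remaining;

    /* Arbitrary cap so the counter cannot overflow a 32-bit unsigned long. */
    assert(n <= 1000000u);

    for (i = 0; i < n; ++i)                                  /* O(n) */
        for (remaining = n; remaining > 1; remaining /= 2)   /* O(log n) */
            ++steps;
    return steps;
}
```

For n = 8 the inner loop runs 3 times per outer pass (24 steps in total), and for n = 1024 it runs 10 times per pass (10,240 steps).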
Disclaimer: The rope-logarithm example was taken from the excellent book Mathematician's Delight by W. W. Sawyer.
It simply means that the time needed for this task grows with log(n) (example: 2s for n = 10, 4s for n = 100, ...). Read the Wikipedia articles on the binary search algorithm and Big O notation for more details.
I can add something interesting that I read in the book by Cormen et al. a long time ago. Imagine a problem where we have to find a solution in a problem space. This problem space should be finite.
Now, if you can prove that at every iteration of your algorithm you cut off a fraction of this space that is no less than some fixed limit, then your algorithm runs in O(log N) time.
I should point out that we are talking here about a relative fraction limit, not an absolute one. Binary search is the classical example: at each step we throw away 1/2 of the problem space. But binary search is not the only such example. Suppose you proved somehow that at each step you throw away at least 1/128 of the problem space. Then your program still runs in O(log N) time, although significantly slower than binary search. This is a very good hint when analyzing recursive algorithms. It can often be proved that at each step the recursion will not explore several of the variants, which cuts off some fraction of the problem space.
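A small numeric sketch of this argument: shrink a space by a fixed fraction per step and count the steps until at most one candidate is left. The name `steps_to_shrink` and the use of floating point for the size of the space are assumptions made for illustration.

```c
#include <assert.h>

/* Number of steps until a problem space of the given size has shrunk
 * to at most one candidate, when every step keeps the fraction `keep`
 * of what is left (keep = 0.5 for binary search, 127.0 / 128.0 when
 * only 1/128 is discarded). Hypothetical helper, for illustration. */
unsigned int steps_to_shrink(double space, double keep) {
    unsigned int steps = 0;

    assert(space >= 1.0 && keep > 0.0 && keep < 1.0);

    while (space > 1.0) {
        space *= keep;   /* discard the fraction (1 - keep) */
        ++steps;
    }
    return steps;
}
```

Squaring the size of the space only roughly doubles the step count for either fraction, which is logarithmic behaviour; the smaller cut-off just carries a much larger constant factor.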
I can give an example with a for loop; once you have grasped the concept, it may be simpler to understand in other contexts.
Consider a loop in which the step grows exponentially, e.g. one whose index doubles on every pass.
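A minimal sketch of such a loop, assuming the index starts at 1 and doubles on every pass; the wrapper name `count_doubling_iterations` is hypothetical and only there so the iterations can be counted.

```c
#include <assert.h>

/* Counts the passes of a loop whose index doubles each time.
 * Hypothetical helper, for illustration only. */
unsigned int count_doubling_iterations(unsigned int n) {
    unsigned long long i;
    unsigned int iterations = 0;

    /* i must be wider than n so that i * 2 cannot wrap around. */
    assert(sizeof i > sizeof n);

    for (i = 1; i <= n; i *= 2)   /* i takes the values 1, 2, 4, 8, ... */
        ++iterations;
    return iterations;
}
```

Any n from 512 to 1023 gives 10 iterations; n = 1024 is the first value that needs an eleventh.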
The complexity in O-notation of this program is O(log(n)). Let's loop through it by hand, with n somewhere between 512 and 1023 (that is, below 1024): the index takes the values 1, 2, 4, 8, 16, 32, 64, 128, 256, 512.
Although n is somewhere between 512 and 1023, only 10 iterations take place. This is because the step in the loop grows exponentially, so it takes only 10 iterations to reach the termination condition.
Now look at it this way: if the exponential grows very fast, then its inverse, the logarithm, grows very slowly.
The difference between O(n) and O(log(n)) is huge, similar to the difference between O(n) and O(a^n) (a being a constant greater than 1).
Many good answers have already been posted to this question, but I believe we really are missing an important one - namely, the illustrated answer.
The following drawing depicts a binary tree. Notice how each level contains double the number of nodes compared to the level above (hence binary):
Binary search is an example with complexity O(log n). Let's say that the nodes in the bottom level of the tree in figure 1 represent items in some sorted collection. Binary search is a divide-and-conquer algorithm, and the drawing shows how we will need (at most) 4 comparisons to find the record we are searching for in this 16 item dataset.
Assume we had instead a dataset with 32 elements. Continue the drawing above to find that we will now need 5 comparisons to find what we are searching for, as the tree has only grown one level deeper when we doubled the amount of data. As a result, the complexity of the algorithm can be described as being of logarithmic order.
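The level-by-level descent can be sketched as a binary search that reports how many halving comparisons it made. The signature and the `steps` out-parameter are assumptions made for illustration, and the final equality check is deliberately not counted, so that the count matches the number of tree levels in the drawing.

```c
#include <assert.h>
#include <stddef.h>

/* Searches the sorted array a[0..n-1] for key. Returns its index, or
 * -1 if it is absent. *steps receives the number of halving
 * comparisons, i.e. the levels descended in the tree picture.
 * Hypothetical helper, for illustration only. */
int binary_search(const int *a, int n, int key, int *steps) {
    int lo = 0, hi = n;   /* remaining candidates are a[lo..hi-1] */

    assert(a != NULL && steps != NULL && n >= 0);

    *steps = 0;
    if (n == 0)
        return -1;

    while (hi - lo > 1) {
        int mid = lo + (hi - lo) / 2;
        ++*steps;
        if (key < a[mid])
            hi = mid;     /* key can only be in the left half */
        else
            lo = mid;     /* key can only be in the right half */
    }
    return a[lo] == key ? lo : -1;
}
```

On 16 sorted items every lookup takes 4 halving comparisons, and on 32 items it takes 5.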
Plotting log(n) on a plain piece of paper will result in a graph where the rise of the curve decelerates as n increases.