In "What exactly does big Ө notation represent?", the most upvoted answer contains the following statement:
> For example, merge sort worst case is both O(n*log(n)) and Omega(n*log(n)) - and thus is also Ө(n*log(n)), but it is also O(n^2), since n^2 is asymptotically "bigger" than it. However, it is not Ө(n^2), since the algorithm is not Omega(n^2).
I have two questions:
- How do you determine that the worst case is O(n*log(n)) and Omega(n*log(n))? In "Introduction to Algorithms" you determine the Theta time of insertion sort by counting the number of times each statement executes and the cost of each execution. Is there a similar method for O and Omega?
- How can merge sort's worst-case be O(n*log(n)) but also O(n^2)?
First of all, the link in your question does not point to the accepted answer; it points to the most upvoted answer.
Now, answering your questions one by one...
> How do you determine that the worst case is O(nLog(n)) and Ω(nLog(n))? In "Introduction to Algorithms" you determine the Theta time of insertion sort by counting the number of times and the cost of the execution of each statement. Is there a similar method for O and Omega?
For the sake of making this answer useful to everyone, I'll mention again the important points related to Merge Sort.
- Merge Sort is a divide and conquer algorithm.
- You take a sequence of numbers, break it in half recursively, sort each half individually, and finally merge the halves back up recursively, ending with a sorted sequence of numbers (see the sketch just below).
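For concreteness, here is a minimal Python sketch of that divide-and-conquer scheme (the function names `merge_sort` and `merge` are my own, just for illustration, not taken from any particular textbook):

```python
def merge_sort(seq):
    """Sort a list of numbers with the divide-and-conquer scheme described above."""
    if len(seq) <= 1:                 # base case: 0 or 1 elements are already sorted
        return list(seq)
    mid = len(seq) // 2
    left = merge_sort(seq[:mid])      # sort the first half recursively
    right = merge_sort(seq[mid:])     # sort the second half recursively
    return merge(left, right)         # merge the sorted halves back together

def merge(left, right):
    """Merge two already-sorted lists into one sorted list."""
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])           # one of these is empty, the other holds the leftovers
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))   # [1, 2, 2, 3, 4, 5, 6, 7]
```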
Take the simplest possible case. How many elements do you touch† if you run Merge Sort on that case? It is easy to visualize with the help of the image below: the number of elements touched at each level is cn, and there are a total of lg n levels. So the total number of elements touched, even in the easiest case, is cnLog(n).
Image Source
Thus, Merge Sort will always have a minimum complexity ≈ cnLog(n) and that's also written as Merge Sort is Ω(nLog(n)).
Ok. So we understand that even in the most simplistic run of this algorithm we have to touch the elements of the sequence a total of cnLog(n) times. Now...
How many times do we touch the elements of the sequence otherwise? Well, in any case whatsoever, the recursion tree of Merge Sort will be exactly the same as above. So it turns out that the run-time complexity of Merge Sort is always proportional to nLog(n)! This means that both the upper and the lower bound of Merge Sort are cnLog(n).
Hence, Merge Sort's run time complexity is not only Ω(nLog(n)) but also O(nLog(n)) which means that it is Ө(nLog(n)).
You wanted to see (aka be able to visualize) the calculations done for Merge Sort. I hope the graphical explanation, which shows the calculations done at each level of the tree, is satisfactory.
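If you would rather see the numbers than the picture, here is a small Python sketch (my own instrumentation, not from CLRS) that counts every element touched during the merges and compares the total against n·lg(n):

```python
import math

def merge_sort_counting(seq, touches):
    """Merge Sort that also counts every element 'touched' while merging."""
    if len(seq) <= 1:
        return list(seq)
    mid = len(seq) // 2
    left = merge_sort_counting(seq[:mid], touches)
    right = merge_sort_counting(seq[mid:], touches)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    touches[0] += len(merged)          # every element of this merge's output was touched once
    return merged

for n in (16, 1024, 65536):
    touches = [0]
    merge_sort_counting(list(range(n, 0, -1)), touches)    # a reversed input
    print(n, touches[0], n * int(math.log2(n)))            # touches vs. n * lg(n)
```

For powers of two the two columns match exactly, and they stay proportional no matter how the input is ordered, which is the Ө(nLog(n)) behaviour argued above.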
> How can merge sort's worst-case be O(n*log(n)) but also O(n^2)?
That's straightforward, but just like the previous answer I'll begin with the basic definition so that the answer is useful to everyone.
I like the image below for understanding the terminologies.
Image Source
Note that Big-O only insists on bounding from above. It does not care how far above. So an algorithm has infinitely many functions as valid Big-O bounds. The most informative of them is the one that is immediately above; otherwise every algo is O(infinity).
Now that the definition is clear, we can easily say that Merge Sort is O(nLog(n)) as well as O(n^2).
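To tie that back to the formal definition (the constants below are just one convenient choice, not the only one): f(n) = O(g(n)) means there exist c > 0 and n0 such that f(n) <= c*g(n) for all n >= n0. For Merge Sort:

```latex
% log(n) <= n for every n >= 1, so c = 1 and n_0 = 1 already work:
n \log n \;\le\; n \cdot n \;=\; n^2 \quad \text{for all } n \ge 1
\qquad\Longrightarrow\qquad n \log n = O(n^2).
```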
† The word 'touch' in this explanation is the equivalent of what you have written in the question as "counting [the] number of times". During the run of the algorithm, every time a number is encountered we say that the number is touched. So, effectively, we count the touches of the numbers to determine the complexity of the algorithm.
> How can merge sort's worst-case be O(n*log(n)) but also O(n^2)?
Because the notation f(n) = O(g(n)) means that f(n) is asymptotically smaller-than-or-equal-to g(n). Perhaps a less misleading notation would be f(n) <= O(g(n)). Since n^2 is asymptotically greater than or equal to n*log(n), the above inequality holds.
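A quick way to check that inequality (a standard limit test, added here for completeness rather than taken from the answer above) is to compare the growth rates directly:

```latex
% If f(n)/g(n) -> 0, then f grows strictly slower than g, so f(n) = O(g(n)).
\lim_{n \to \infty} \frac{n \log n}{n^2}
  \;=\; \lim_{n \to \infty} \frac{\log n}{n}
  \;=\; 0
\qquad\Longrightarrow\qquad n \log n = O(n^2).
```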
As to your first question: determining an average running time is usually a lot trickier and more involved than determining an upper bound on the worst case. Anyway, here's a nice explanation I've found for you.
Answer to second question:
When we say 'merge sort runs in O(x)', this is equivalent to saying 'merge sort has an asymptotic runtime behavior proportional to at most x'. In other words: O(x) stands for an upper bound. Note that this is independent of worst-case vs. best-case. You can give an upper bound for a best-case scenario, too.
To summarize: an algorithm can simultaneously be O(n log n), O(n^2), etc., just as the number 3 is smaller than 5 and 10 simultaneously.
Answer to first question:
Most proofs for Theta are actually composed of two separate proofs -- one for the lower bound Omega(x) and the other for the upper bound O(x). The lower bound is applied to the problem: you are stating that the problem cannot be solved in less than x (asymptotically). This usually requires a different proof strategy than the upper bound, where you prove that a given algorithm actually solves the problem and does not exceed the time given by x (asymptotically).
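As a concrete illustration of such a problem-level lower bound (the standard decision-tree argument for comparison sorting, which is not spelled out in this answer): any comparison sort must be able to distinguish all n! input orderings, so its decision tree needs at least n! leaves, and a binary tree with that many leaves must have height

```latex
% n! contains at least n/2 factors that are each >= n/2:
h \;\ge\; \log_2(n!) \;\ge\; \log_2\!\Big(\big(\tfrac{n}{2}\big)^{n/2}\Big)
  \;=\; \tfrac{n}{2}\,\log_2\!\tfrac{n}{2} \;=\; \Omega(n \log n).
```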