I am watching the Berkley Uni online lecture and stuck on the below.
Problem: Assume you have a collection of CD that is already sorted. You want to find the list of CD with whose title starts with "Best Of."
Solution: We will use binary search to find the first case of "Best Of" and then we print until the tile is no longer "Best Of"
Additional question: Find the complexity of this Algorithm.
Upper Bound: Binary Search Upper Bound is O(log n), so once we have found it then we print let say k title. so it is O(logn + k)
Lower Bound: Binary Search lower Bound is Omega(1) assuming we are lucky and the record title is the middle title. In this case it is Omega(k)
This is the way I analyzed it.
But in the lecture, the lecturer used best case and worst case.
I have two questions about it:
- Why need to use best case and worst case, aren't big-O and Omega considered as the best and worst cases the algorithm can perform?
His analysis was
Worst Case : Theta(logn + k)
Best Case : Theta (k)
If I use the concept of Worst Case as referring to the data and having nothing to do with algorithm then yep, his analysis is right.
This is because assuming the worst case (CD title in the end or not found) then the Big O and Omega is both log n there it is theta(log n +k).
Assuming you do not do "best case" and "worst case", then how do you analyze the algorithm? Is my analysis right?
Why need to use best case and worst case, aren't big-O and Omega considered as the best and worst cases the algorithm can perform?
No, the Ο and Ω notations do only describe the bounds of a function that describes the asymptotic behavior of the actual behavior of the algorithm. Here’s a good
- Ω describes the lower bound: f(n) ∈ Ω(g(n)) means the asymptotic behavior of f(n) is not less than g(n)·k for some positive k, so f(n) is always at least as much as g(n)·k.
- Ο describes the upper bound: f(n) ∈ Ο(g(n)) means the asymptotic behavior of f(n) is not more than g(n)·k for some positive k, so f(n) is always at most as much as g(n)·k.
These two can be applied on both the best case and the worst case for binary search:
- best case: first element you look at is the one you are looking for
- Ω(1): you need at least one lookup
- Ο(1): you need at most one lookup
- worst case: element is not present
- Ω(log n): you need at least log n steps until you can say that the element you are looking for is not present
- Ο(log n): you need at most log n steps until you can say that the element you are looking for is not present
You see, the Ω and Ο values are identical. In that case you can say the tight bound for the best case is Θ(1) and for the worst case is Θ(log n).
But often we do only want to know the upper bound or tight bound as the lower bound has not much practical information.
Assuming you do not do "best case" and "worst case", then how do you analyze the algorithm? Is my analysis right?
Yes, your analysis seems correct.
For your first question, there is a difference between the best-case runtime of an algorithm, the worst-case runtime of an algorithm, big-O, and big-Ω notations. The best- and worst-case runtimes of an algorithm are actual mathematical functions with precise values that tell you the maximum and minimum amount of work an algorithm will do. To describe the growth rates of these functions, we can use big-O and big-Ω notation. However, it's possible to describe the best-case behavior of a function with big-Ω notation or the worst-case with big-O notation. For example, we might know that the worst-case runtime of a function might be O(n2), but not actually know which function that's O(n2) the worst-case behavior is. This might occur if we wanted to upper-bound the worst-case behavior so that we know that it isn't catastrophically bad. It's possible that in this case the worst-case behavior is actually Θ(n), in which case the O(n2) is a bad upper bound, but saying that the worst-case behavior is O(n2) in this case indicates that it isn't terrible. Similarly, we might say that the best-case behavior of an algorithm is Ω(n), for example, if we know that it must do at least linear work but can't tell if it must do much more than this.
As to your second question, the analysis of the worst-case and best-case behaviors of the algorithm are absolutely dependent on the structure of the input data, and it's usually quite difficult to analyze the best- and worst-case behavior of an algorithm without seeing how it performs on different data sets. It's perfectly reasonable to do a worst-case analysis by exhibiting some specific family of inputs (note that it has to be a family of inputs, not just one input, since we need to get an asymptotic bound) and showing that the algorithm must run poorly on that input. You can do a best-case analysis in the same way.
Hope this helps!