Still sort of confused about Big O notation

2019-03-19 10:18发布

问题:

So I've been trying to understand Big O notation as well as I can, but there are still some things I'm confused about. So I keep reading that if something is O(n), it usually is referring to the worst-case of an algorithm, but that it doesn't necessarily have to refer to the worst case scenario, which is why we can say the best-case of insertion sort for example is O(n). However, I can't really make sense of what that means. I know that if the worst-case is O(n^2), it means that the function that represents the algorithm in its worst case grows no faster than n^2 (there is an upper bound). But if you have O(n) as the best case, how should I read that as? In the best case, the algorithm grows no faster than n? What I picture is a graph with n as the upper bound, like

If the best case scenario of an algorithm is O(n), then n is the upper bound of how fast the operations of the algorithm grow in the best case, so they cannot grow faster than n...but wouldn't that mean that they can grow as fast as O(log n) or O(1), since they are below the upper bound? That wouldn't make sense though, because O(log n) or O(1) is a better scenario than O(n), so O(n) WOULDN'T be the best case? I'm so lost lol

回答1:

Big-O, Big-Θ, Big-Ω are independent from worst-case, average-case, and best-case.

The notation f(n) = O(g(n)) means f(n) grows no more quickly than some constant multiple of g(n).
The notation f(n) = Ω(g(n)) means f(n) grows no more slowly than some constant multiple of g(n).
The notation f(n) = Θ(g(n)) means both of the above are true.

Note that f(n) here may represent the best-case, worst-case, or "average"-case running time of a program with input size n.
Furthermore, "average" can have many meanings: it can mean the average input or the average input size ("expected" time), or it can mean in the long run (amortized time), or both, or something else.

Often, people are interested in the worst-case running time of a program, amortized over the running time of the entire program (so if something costs n initially but only costs 1 time for the next n elements, it averages out to a cost of 2 per element). The most useful thing to measure here is the least upper bound on the worst-case time; so, typically, when you see someone asking for the Big-O of a program, this is what they're looking for.

Similarly, to prove a problem is inherently difficult, people might try to show that the worst-case (or perhaps average-case) running time is at least a certain amount (for example, exponential).
You'd use Big-Ω notation for these, because you're looking for lower bounds on these.

However, there is no special relationship between worst-case and Big-O, or best-case and Big-Ω.
Both can be used for either, it's just that one of them is more typical than the other.

So, upper-bounding the best case isn't terribly useful. Yes, if the algorithm always takes O(n) time, then you can say it's O(n) in the best case, as well as on average, as well as the worst case. That's a perfectly fine statement, except the best case is usually very trivial and hence not interesting in itself.

Furthermore, note that f(n) = n = O(n2) -- this is technically correct, because f grows more slowly than n2, but it is not useful because it is not the least upper bound -- there's a very obvious upper bound that's more useful than this one, namely O(n). So yes, you're perfectly welcome to say the best/worst/average-case running time of a program is O(n!). That's mathematically perfectly correct. It's just useless, because when people ask for Big-O they're interested in the least upper bound, not just a random upper bound.

It's also worth noting that it may simply be insufficient to describe the running-time of a program as f(n). The running time often depends on the input itself, not just its size. For example, it may be that even queries are trivially easy to answer, whereas odd queries take a long time to answer.
In that case, you can't just give f as a function of n -- it would depend on other variables as well. In the end, remember that this is just a set of mathematical tools; it's your job to figure out how to apply it to your program and to figure out what's an interesting thing to measure. Using tools in a useful manner needs some creativity, and math is no exception.



回答2:

Informally speaking, best case has O(n) complexity means that when the input meets certain conditions (i.e. is best for the algorithm performed), then the count of operations performed in that best case, is linear with respect to n (e.g. is 1n or 1.5n or 5n). So if the best case is O(n), usually this means that in the best case it is exactly linear with respect to n (i.e. asymptotically no smaller and no bigger than that) - see (1). Of course, if in the best case that same algorithm can be proven to perform at most c * log N operations (where c is some constant), then this algorithm's best case complexity would be informally denoted as O(log N) and not as O(N) and people would say it is O(log N) in its best case.

Formally speaking, "the algorithm's best case complexity is O(f(n))" is an informal and wrong way of saying that "the algorithm's complexity is Ω(f(n))" (in the sense of the Knuth definition - see (2)).

See also:

(1) Wikipedia "Family of Bachmann-Landau notations"

(2) Knuth's paper "Big Omicron and Big Omega and Big Theta"

(3) Big Omega notation - what is f = Ω(g)?

(4) What is the difference between Θ(n) and O(n)?

(5) What is a plain English explanation of "Big O" notation?



回答3:

I find it easier to think of O() as about ratios than about bounds. It is defined as bounds, and so that is a valid way to think of it, but it seems a bit more useful to think about "if I double the number/size of inputs to my algorithm, does my processing time double (O(n)), quadruple (O(n^2)), etc...". Thinking about it that way makes it a little bit less abstract - at least to me...