Meaning of average complexity when using Big-O notation

Question:

While answering this question, a debate began in the comments about the complexity of QuickSort. What I remember from my university days is that QuickSort is O(n^2) in the worst case, O(n log(n)) in the average case, and O(n log(n)) (but with a tighter bound) in the best case.

What I need is a correct mathematical explanation of the meaning of average complexity, to explain clearly what it is about to someone who believes big-O notation can only be used for the worst case.

What I remember is that to define average complexity, you should consider the complexity of the algorithm for all possible inputs and count how many cases are degenerate and how many are normal. If the number of degenerate cases divided by n tends towards 0 as n gets big, then you can speak of the average complexity of the overall function as that of the normal cases.

Is this definition right, or is the definition of average complexity different? And if it's correct, can someone state it more rigorously than I did?

Answer 1:

If you're looking for a formal definition, then:

Average complexity is the expected running time for a random input.
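To spell that out (a standard formalization, not part of the original answer): assuming some probability distribution D_n over the inputs of size n (uniform, unless stated otherwise), and writing T(x) for the number of steps the algorithm takes on input x, the average-case complexity is the expectation

```latex
% Average-case complexity as an expectation; D_n is an assumed
% distribution over the inputs of size n (uniform unless stated
% otherwise) and T(x) is the step count on input x.
T_{\mathrm{avg}}(n) \;=\; \operatorname{E}_{x \sim D_n}\!\bigl[T(x)\bigr]
                  \;=\; \sum_{x \,:\, |x| = n} \Pr[x] \, T(x)
```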



Answer 2:

You're right.

Big O (big Theta, etc.) is used to measure functions. When you write f = O(g), it doesn't matter what f and g mean. They could be average time complexities, worst-case time complexities, space complexities, the distribution of primes, etc.

Worst-case complexity is a function that takes a size n and tells you the maximum number of steps of an algorithm given an input of size n.

Average-case complexity is a function that takes a size n and tells you the expected number of steps of an algorithm given an input of size n.

As you can see, worst-case and average-case complexity are both functions, so you can use big O to express their growth.
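As a concrete sketch (my own illustration, not from this answer), here is one way to see both functions for a first-element-pivot quicksort: count comparisons on the worst-case input of each size (an already-sorted list) and average the count over random inputs of the same size.

```python
# A rough sketch: estimate worst-case and average-case comparison
# counts of a first-element-pivot quicksort as functions of n.
import random

def quicksort_comparisons(a):
    """Return an idealized comparison count for quicksort on list a."""
    if len(a) <= 1:
        return 0
    pivot, rest = a[0], a[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    # len(rest) comparisons against the pivot, plus the recursive calls.
    return len(rest) + quicksort_comparisons(smaller) + quicksort_comparisons(larger)

for n in [100, 200, 400]:
    worst = quicksort_comparisons(list(range(n)))  # sorted input triggers the worst case
    trials = 200
    avg = sum(quicksort_comparisons(random.sample(range(n), n))
              for _ in range(trials)) / trials     # empirical average over random inputs
    print(f"n={n}: worst={worst}  average={avg:.0f}")
```

The worst-case counts grow like n^2/2, while the averages track roughly 1.39·n·log2(n), matching the O(n^2) worst case and O(n log n) average case from the question.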



Answer 3:

I think your definition is correct, but your conclusions are wrong.

It's not necessarily true that if the proportion of "bad" cases tends to 0, then the average complexity is equal to the complexity of the "normal" cases.

For example, suppose that a fraction 1/(n^2) of the cases are "bad" and the rest are "normal", and that "bad" cases take exactly n^4 operations, whereas "normal" cases take exactly n operations.

Then the average number of operations required is equal to:

(1/n^2) · n^4 + ((n^2 - 1)/n^2) · n = n^2 + n - 1/n

This function is O(n^2), but not O(n).
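A quick numerical check of this (my own arithmetic sketch of the example above):

```python
# The weighted average from the example: a 1/n^2 fraction of inputs
# cost n^4 operations, the remaining fraction cost n operations.
for n in [10, 100, 1000, 10000]:
    avg = (1 / n**2) * n**4 + ((n**2 - 1) / n**2) * n
    print(f"n={n:>6}: average={avg:.2f}  average/n^2={avg / n**2:.4f}")
# average/n^2 tends to 1, so the average grows like n^2 even though
# the proportion of bad cases tends to 0.
```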

In practice, though, you might find that time is polynomial in all cases, and the proportion of "bad" cases shrinks exponentially. That's when you'd ignore the bad cases in calculating an average.



Answer 4:

Average case analysis does the following:

Take all inputs of a fixed length (say n), sum up the running times over all instances of this length, and take the average (this implicitly assumes each input is equally likely).

The problem is that you will probably have to enumerate all inputs of length n in order to come up with an average complexity.
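For sizes small enough that enumeration is feasible, this can be done literally. A brute-force sketch (my own illustration) computing the exact average number of comparisons insertion sort makes over all n! permutations of n distinct elements:

```python
# Exact average-case analysis by enumerating every input of size n.
from itertools import permutations

def insertion_sort_comparisons(a):
    """Return the number of comparisons insertion sort makes on list a."""
    a = list(a)
    count = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            count += 1                      # compare a[j-1] with a[j]
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
                j -= 1
            else:
                break
    return count

for n in range(2, 8):
    costs = [insertion_sort_comparisons(p) for p in permutations(range(n))]
    print(f"n={n}: worst={max(costs)}  average={sum(costs) / len(costs):.3f}")
# There are n! inputs of size n, which is exactly why this direct
# enumeration only works for tiny n.
```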



Answer 5:

Let's refer to the Big O notation article on Wikipedia:

Let f and g be two functions defined on some subset of the real numbers. One writes f(x) = O(g(x)) as x → ∞ if ...

So what the premise of the definition states is that the function f should take a number as input and yield a number as output. What input number are we talking about? It's supposedly the number of elements in the sequence to be sorted. What output number could we be talking about? It could be the number of operations done to sort the sequence. But wait: what is a function? From Wikipedia:

a function is a relation between a set of inputs and a set of permissible outputs with the property that each input is related to exactly one output.

Are we producing exactly one output with our prior definition? No, we aren't. For a given size of sequence we can get a wide variation in the number of operations. So to ensure the definition is applicable to our case, we need to reduce the set of possible outcomes (numbers of operations) to a single value. It can be the maximum ("the worst case"), the minimum ("the best case") or an average.
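Written out (my own formalization of the reduction just described, with T(x) denoting the number of operations on input x):

```latex
% Three ways to collapse the set of possible operation counts for
% inputs of size n into a single value; the average assumes some
% distribution over the inputs of size n.
T_{\mathrm{worst}}(n) = \max_{|x| = n} T(x), \qquad
T_{\mathrm{best}}(n) = \min_{|x| = n} T(x), \qquad
T_{\mathrm{avg}}(n) = \operatorname{E}_{|x| = n}\bigl[T(x)\bigr]
```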

The conclusion is that talking about the best/worst/average case is mathematically correct, and that using big O notation in the context of sorting complexity without saying which case is meant is somewhat sloppy.

On the other hand, we could be more precise and use big Theta notation instead of big O notation.
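For reference, the standard definition (for eventually positive functions): big Theta bounds the growth from both sides, while big O only bounds it from above.

```latex
% f grows like g up to constant factors:
f(n) = \Theta(g(n)) \iff f(n) = O(g(n)) \ \text{and} \ g(n) = O(f(n))
```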