What is amortized analysis of algorithms? [closed]

Posted 2019-01-08 03:46

Question:

How is it different from asymptotic analysis? When do you use it, and why?

I've read some articles that seem to have been written well, like these:

  • http://www.ugrad.cs.ubc.ca/~cs320/2010W2/handouts/aa-nutshell.pdf

  • http://www.cs.princeton.edu/~fiebrink/423/AmortizedAnalysisExplained_Fiebrink.pdf

but I still don't fully understand these concepts.

So, can anyone please simplify it for me?

Answer 1:

Amortized analysis doesn't naively multiply the number of invocations by the worst case for one invocation.

For example, for a dynamic array that doubles in size when needed, a plain worst-case analysis of a single operation would only conclude that adding an item costs O(n), because the array might need to grow and copy all elements to the new array. Amortized analysis takes into account that in order to have to grow, n/2 items must have been added without causing a grow since the previous grow, so adding an item really only takes O(1) amortized (the O(n) cost of growing is spread over the n/2 cheap additions that preceded it).
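To make the doubling argument concrete, here is a minimal sketch of such an array that counts how many element copies growing causes. The class `DynamicArray`, its `copies` counter, the starting capacity of 1, and the helper `copies_for` are all illustrative assumptions, not how any particular language implements its lists.

```python
class DynamicArray:
    """Toy dynamic array that doubles its capacity when full (illustrative only)."""

    def __init__(self):
        self.capacity = 1  # assumed starting capacity
        self.size = 0
        self.slots = [None] * self.capacity
        self.copies = 0  # total element copies caused by growing

    def append(self, item):
        if self.size == self.capacity:
            # Grow: allocate twice the space and copy every existing element.
            new_slots = [None] * (self.capacity * 2)
            for i in range(self.size):
                new_slots[i] = self.slots[i]
                self.copies += 1
            self.slots = new_slots
            self.capacity *= 2
        self.slots[self.size] = item
        self.size += 1


def copies_for(n):
    """Return the total number of element copies made while appending n items."""
    arr = DynamicArray()
    for i in range(n):
        arr.append(i)
    return arr.copies
```

Under these assumptions, `copies_for(n)` stays below 2n for every n, so the copying work per append is bounded by a constant even though a single append can copy the whole array.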

Amortized analysis is not the same as an "average performance" - amortized analysis gives a hard guarantee on the total cost of any sequence of that many operations.



Answer 2:

There are a lot of answers to "what", but none to "why".

As everyone else has said, asymptotic analysis is about how the performance of a given operation scales to a large data set. Amortized analysis is about how the cost averaged over all of the operations on a large data set scales. Amortized analysis never gives worse bounds than the per-operation worst case, and sometimes gives much better ones.

If you are concerned with the total running time of a longer job, the better bounds of amortized analysis are probably what you care about. This is why scripting languages (for instance) are often happy to grow arrays and hash tables by some factor even though that is an expensive operation. (Growing can be an O(n) operation, but the amortized cost is O(1) because it happens rarely.)

If you are doing real time programming (individual operations must complete in a predictable time), then the better bounds from amortized analysis don't matter. It doesn't matter if the operation on average was fast, if you failed to finish it in time to get back and adjust the bandsaw before it cut too far...
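The real-time concern shows up if you look at each operation separately. The sketch below records the cost of every append to a doubling array under a simple assumed cost model (1 unit to write the new item, plus 1 unit per element copied when the array grows, starting from capacity 1). The function name `append_costs` is hypothetical.

```python
def append_costs(n):
    """Cost of each of n appends to a doubling array: 1 write plus any copies."""
    costs = []
    capacity, size = 1, 0  # assumed starting capacity of 1
    for _ in range(n):
        cost = 1  # writing the new item
        if size == capacity:
            cost += size  # copy every existing item into the bigger array
            capacity *= 2
        size += 1
        costs.append(cost)
    return costs


costs = append_costs(1025)
worst = max(costs)                 # cost of the single slowest append
average = sum(costs) / len(costs)  # cost per append over the whole run
```

With 1025 appends the average stays below 3 units, but the slowest single append costs 1025 units. A batch job only cares about the first number; a bandsaw controller cares about the second.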

Which one matters in your case depends on exactly what your programming problem is.



Answer 3:

Asymptotic analysis

This term refers to the analysis of algorithm performance under the assumption that the data the algorithm operates on (the input) is, in layman's terms, "large enough that making it larger will not change the conclusion". Although the exact size of the input does not need to be specified (we only need an upper bound), the data set itself has to be specified.

Note that so far we have only talked about the method of analysis; we have not specified exactly which quantity we are analyzing (time complexity? space complexity?), and neither have we specified which metric we are interested in (worst case? best case? average?).

In practice the term asymptotic analysis commonly refers to upper bound time complexity of an algorithm, i.e. the worst case performance measured by total running time, which is represented by big-O notation (e.g. a sorting algorithm might be O(n log n)).

Amortized analysis

This term refers to the analysis of algorithm performance based on a specific sequence of operations that targets the worst case scenario -- that is, amortized analysis does imply that the metric is worst case performance (although it still does not say which quantity is being measured). To perform this analysis, we need to specify the size of the input, but we do not need to make any assumptions about its form.

In layman's terms, amortized analysis is picking an arbitrary size for the input and then "playing through" the algorithm. Whenever a decision that depends on the input must be made, the worst path is taken¹. After the algorithm has run to completion we divide the calculated complexity by the size of the input to produce the final result.

¹note: To be precise, the worst path that is theoretically possible. If you have a vector that dynamically doubles in size each time its capacity is exhausted, "worst case" does not mean to assume that it will need to double upon every insertion because the insertions are processed as a sequence. We are allowed to (and indeed must) use known state to mathematically eliminate as many "even worse" cases as we can, even while the input remains unknown.
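For the doubling vector in the note, the "play it through, then divide by the input size" step can be written out as a short sum. This is a sketch under assumed unit costs: each insertion costs 1 for the write, the vector starts with capacity 1, and growing from capacity 2^j copies 2^j elements. With n ≥ 2 insertions:

```latex
\begin{align*}
T(n) &= \underbrace{n}_{\text{writes}}
      + \underbrace{\sum_{j=0}^{\lfloor \log_2 (n-1) \rfloor} 2^{j}}_{\text{copies while growing}} \\
     &= n + 2^{\lfloor \log_2 (n-1) \rfloor + 1} - 1 \\
     &\le n + 2(n-1) - 1 \;<\; 3n,
\qquad\text{so}\qquad \frac{T(n)}{n} < 3 = O(1).
\end{align*}
```

The sum only has about log n terms because the known state (the capacity after each doubling) rules out a grow on every insertion.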

The most important difference

The critical difference between asymptotic and amortized analysis is that the former is dependent on the input itself, while the latter is dependent on the sequence of operations the algorithm will execute.

Therefore:

  • asymptotic analysis allows us to assert that the complexity of the algorithm when it is given a best/worst/average case input of size approaching N is bounded by some function F(N) -- where N is a variable
  • amortized analysis allows us to assert that the complexity of the algorithm when it is given an input of unknown characteristics but known size N is no worse than the value of a function F(N) -- where N is a known value


Answer 4:

The answer to this is succinctly defined by the first sentence of the Amortized Analysis chapter in the book - Introduction to Algorithms:

In an amortized analysis, the time required to perform a sequence of data-structure operations is averaged over all the operations performed.

We usually describe how a program's cost grows using asymptotic analysis, which bounds that growth by a function for the worst, best or average case.

But this can be misleading when the cost reaches a peak only occasionally while, in general, the program doesn't need much computation.

Hence, it makes more sense to average the cost over a sequence of operations, even though a single operation might be expensive. This is Amortized Analysis!
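A standard textbook illustration of this averaging is incrementing a binary counter: one increment can flip many bits when a long carry ripples through, yet the total over a sequence stays small. The sketch below counts bit flips; the function names, the little-endian list representation and the 16-bit width are assumptions chosen for the example.

```python
def increment(bits):
    """Increment a little-endian list of bits in place; return the bits flipped."""
    flips = 0
    i = 0
    # Turn trailing 1s into 0s (the carry), counting each flip.
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0
        flips += 1
        i += 1
    # Set the first 0 bit to 1, unless the counter overflowed.
    if i < len(bits):
        bits[i] = 1
        flips += 1
    return flips


def count_flips(n, width=16):
    """Run n increments from zero; return (total flips, worst single increment)."""
    bits = [0] * width
    per_increment = [increment(bits) for _ in range(n)]
    return sum(per_increment), max(per_increment)
```

For 1000 increments, `count_flips(1000)` gives 1994 flips in total, with a worst single increment of 10 flips. That is fewer than 2 flips per increment on average, even though the expensive case exists.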

Amortized analysis is an alternative to the per-operation asymptotic technique for calculating complexity. It helps us calculate a complexity that is truer to practice, so that we can compare and decide between two or more algorithms.



Answer 5:

The best reference I've found so far for understanding the amortized analysis of algorithms is the book Introduction to Algorithms, third edition, chapter 17: "Amortized Analysis". It's all there, explained much better than what can be found in a Stack Overflow post. You'll find the book in the library of any decent university.



Answer 6:

Regular asymptotic analysis looks at the performance of an individual operation asymptotically, as a function of the size of the problem. The O() notation is what indicates an asymptotic analysis.

Amortized analysis (which is also an asymptotic analysis) looks at the total performance of multiple operations on a shared data structure.

The difference is, amortized analysis typically proves that the total computation required for M operations has a better performance guarantee than M times the worst case for the individual operation.

For example, an individual operation on a splay tree of size N can take up to O(N) time. However, a sequence of M operations on a tree of size N is bounded by O( M(1+log N) + N log N ) time, which is roughly O(log N) per operation. Note that an amortized analysis is much stricter than an "average-case" analysis: it proves that any possible sequence of operations will satisfy its asymptotic worst case.



Answer 7:

Amortised analysis deals with the total cost over a number of runs of the routine, and the benefits that can be gained therein. For example, searching an unsorted array of n items for a single match may take up to n comparisons and hence has O(n) complexity. If we know the same array is going to be searched for m items, repeating that search m times would have complexity O(m*n). However, if we sort the array in advance, the sort costs O(n log(n)) and each successive search takes only O(log(n)) on the sorted array. Thus the total amortised cost for m searches taking this approach is O(n*log(n) + m*log(n)). If m is about the same size as n, this comes to O(n log(n)) with pre-sorting, compared to O(n^2) without sorting. Thus the amortised cost is cheaper.
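The comparison above can be checked by counting. In this sketch, `compare_costs` searches for every one of n values (so m = n), once by scanning an unsorted list and once by binary search on a sorted copy. The function names and the reversed-range test data are assumptions for illustration, and the counter only tracks search probes: the cost of the sort itself, O(n log(n)), is not included.

```python
def binary_search_probes(ordered, target):
    """Binary search on a sorted list; return (found, number of probes)."""
    lo, hi, probes = 0, len(ordered), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if ordered[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(ordered) and ordered[lo] == target
    return found, probes


def compare_costs(n):
    """Search for each of n values; return (linear comparisons, binary probes)."""
    data = list(range(n - 1, -1, -1))  # deliberately not in ascending order
    targets = range(n)

    linear_total = 0
    for t in targets:
        for x in data:  # scan until the target is found
            linear_total += 1
            if x == t:
                break

    ordered = sorted(data)  # one-off O(n log n) cost paid up front
    binary_total = sum(binary_search_probes(ordered, t)[1] for t in targets)
    return linear_total, binary_total
```

For n = 1024, the linear scans need 524800 comparisons in total, while the binary searches need at most 11 probes each, so no more than 11264 in total.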

Put simply, by spending a bit extra early on, we can save a lot later.