While questions have been asked before about linked lists versus arrays, the answers mostly boil down to what most of us probably have already learned at some point:
- Lists are good at inserting and deleting
- Arrays are good at random access
Now, respected people like Bjarne Stroustrup have argued that arrays practically always outperform linked lists because they make much better use of the cache hierarchy of modern hardware. He also states that the performance advantage of arrays grows with their size.
While I basically understand his arguments and agree with him, I wonder whether this is still true when the array is significantly larger than the cache. I would argue that this is exactly the case where performance really matters.
To summarize: Do arrays still perform better than lists in most cases, even if their size is much larger than cache size and a large percentage of the operations are inserts or deletes? If yes, how can this be explained?
Arrays perform better not only because of caching, but also because of prefetching.
Caching has two main benefits. First, sequential elements may reside in the same cache line, so a single fetch serves several accesses (in a linked list the next element usually lives somewhere else in memory, so you don't get this benefit). This benefit shrinks as the elements get bigger and disappears once a single element exceeds the cache line size.
The second benefit is more subtle: you get better utilization of the cache because consecutive addresses map to different cache sets. An array up to the cache size can therefore fit in its entirety, while the randomly placed nodes of a list may collide in the same sets and cause conflict misses (thrashing) even when the total list size is smaller than the cache.
However, aside from caching, the bigger benefit of contiguously allocated structures comes from prefetching. Most CPUs automatically prefetch the next cache lines when they detect a streaming access pattern such as an array scan, and can therefore hide most of the misses during sequential access. A linked-list traversal, by contrast, chases pointers the hardware cannot predict ahead of time.
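To make the caching and prefetching argument concrete, here is a minimal (and deliberately unscientific) C++ sketch that sums the same values twice: once through a contiguous array, and once by chasing pointers through nodes deliberately scattered in memory. The element count and layout are arbitrary assumptions for illustration; compiled with optimizations on typical hardware, the array scan should come out far ahead once the data no longer fits in cache.

```cpp
// Minimal sketch (not a rigorous benchmark): sum N ints stored contiguously
// versus the same ints reached by chasing pointers through shuffled nodes.
// The array walk benefits from cache lines and hardware prefetching; the
// pointer chase mostly does not. Build with optimizations, e.g. -O2.
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <iostream>
#include <numeric>
#include <random>
#include <vector>

struct Node { int value; Node* next; };

int main() {
    const std::size_t n = 10'000'000;   // arbitrary size, larger than typical caches

    std::vector<int> arr(n, 1);

    // Build a list whose nodes sit in random positions of a pool, so every
    // ->next hop is likely a cache miss and cannot be prefetched.
    std::vector<Node> pool(n);
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), 0);
    std::shuffle(order.begin(), order.end(), std::mt19937{42});
    for (std::size_t i = 0; i + 1 < n; ++i)
        pool[order[i]] = {1, &pool[order[i + 1]]};
    pool[order[n - 1]] = {1, nullptr};
    Node* head = &pool[order[0]];

    auto time = [](const char* label, auto&& f) {
        auto t0 = std::chrono::steady_clock::now();
        long long s = f();
        auto t1 = std::chrono::steady_clock::now();
        std::cout << label << ": sum=" << s << " took "
                  << std::chrono::duration<double, std::milli>(t1 - t0).count()
                  << " ms\n";
    };

    time("array", [&] { long long s = 0; for (int v : arr) s += v; return s; });
    time("list ", [&] { long long s = 0; for (Node* p = head; p; p = p->next) s += p->value; return s; });
}
```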
On the other hand, all of these benefits are constant-factor optimizations; they can never overcome an asymptotic complexity difference such as the O(1) insertion at a known position that a list provides, versus the O(n) shifting an array requires. Ultimately you will need to benchmark your code to see whether such operations dominate and become a bottleneck; if so, a hybrid approach may be needed.
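As an illustration of what such a hybrid might look like, below is a hedged sketch of an "unrolled" list: a linked list of small fixed-size chunks. The class name, chunk size, and interface are invented for this example, and only appending and a cache-friendly scan are shown; a full implementation would also split chunks to support cheap insertion in the middle. Real-world candidates for the same idea include std::deque or a purpose-built container.

```cpp
// Illustrative toy, not a production container: a linked list of small
// contiguous chunks. Scanning within a chunk is as cache-friendly as an
// array, while growing the structure never shifts the whole sequence.
#include <array>
#include <cstddef>
#include <iostream>
#include <memory>

template <typename T, std::size_t ChunkCap = 64>
class UnrolledList {
    struct Chunk {
        std::array<T, ChunkCap> data;
        std::size_t count = 0;
        std::unique_ptr<Chunk> next;
    };
    std::unique_ptr<Chunk> head = std::make_unique<Chunk>();
    Chunk* tail = head.get();

public:
    // Append into the last chunk, adding a new chunk only when it is full.
    void push_back(const T& value) {
        if (tail->count == ChunkCap) {
            tail->next = std::make_unique<Chunk>();
            tail = tail->next.get();
        }
        tail->data[tail->count++] = value;
    }

    // Sum everything: within each chunk this is a plain array scan.
    long long sum() const {
        long long s = 0;
        for (const Chunk* c = head.get(); c; c = c->next.get())
            for (std::size_t i = 0; i < c->count; ++i) s += c->data[i];
        return s;
    }
};

int main() {
    UnrolledList<int> ul;
    for (int i = 1; i <= 1000; ++i) ul.push_back(i);
    std::cout << ul.sum() << '\n';  // prints 500500
}
```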