Say you need to have a list/array of integers which you need iterate frequently, and I mean extremely often. The reasons may vary, but say it's in the heart of the inner most loop of a high volume processing.
In general, one would opt for using Lists (List) due to their flexibility in size. On top of that, msdn documentation claims Lists use an array internally and should perform just as fast (a quick look with Reflector confirms this). Neverless, there is some overhead involved.
Did anyone actually measure this? would iterating 6M times through a list take the same time as an array would?
Very easy to measure...
In a small number of tight-loop processing code where I know the length is fixed I use arrays for that extra tiny bit of micro-optimisation; arrays can be marginally faster if you use the indexer / for form - but IIRC believe it depends on the type of data in the array. But unless you need to micro-optimise, keep it simple and use
List<T>
etc.Of course, this only applies if you are reading all of the data; a dictionary would be quicker for key-based lookups.
Here's my results using "int" (the second number is a checksum to verify they all did the same work):
(edited to fix bug)
based on the test rig:
In some brief tests I have found a combination of the two to be better in what I would call reasonably intensive Math:
Type:
List<double[]>
Type:
List<List<double>>
Type:
double[rows * columns]
Running the Code:
I do wish we had some top notch Hardware Accelerated Matrix Classes like the .NET Team have done with the
System.Numerics.Vectors
Class!C# could be the best ML Language with a bit more work in this area!
The measurements are nice, but you are going to get significantly different results depending on what you're doing exactly in your inner loop. Measure your own situation. If you're using multi-threading, that alone is a non-trivial activity.
if you are just getting a single value out of either (not in a loop) then both do bounds checking (you're in managed code remember) it's just the list does it twice. See the notes later for why this is likely not a big deal.
If you are using your own for(int int i = 0; i < x.[Length/Count];i++) then the key difference is as follows:
If you are using foreach then the key difference is as follows:
The bounds checking is often no big deal (especially if you are on a cpu with a deep pipeline and branch prediction - the norm for most these days) but only your own profiling can tell you if that is an issue. If you are in parts of your code where you are avoiding heap allocations (good examples are libraries or in hashcode implementations) then ensuring the variable is typed as List not IList will avoid that pitfall. As always profile if it matters.
Do not attempt to add capacity by increasing the number of elements.
Performance
Indeed, if you perform some complex calculations inside the loop, then the performance of the array indexer versus the list indexer may be so marginally small, that eventually, it doesn't matter.