Question:
Say you need a list/array of integers that you iterate over frequently, and I mean extremely often. The reasons may vary, but say it's at the heart of the innermost loop of a high-volume processing job.
In general, one would opt for Lists (List<T>) because of their flexibility in size. On top of that, the MSDN documentation claims that Lists use an array internally and should perform just as fast (a quick look with Reflector confirms this). Nevertheless, there is some overhead involved.
Has anyone actually measured this? Would iterating 6M times through a list take the same time as through an array?
Answer 1:
Very easy to measure...
In a small number of tight-loop processing code paths where I know the length is fixed, I use arrays for that extra tiny bit of micro-optimisation; arrays can be marginally faster if you use the indexer / for form, but IIRC it depends on the type of data in the array. But unless you need to micro-optimise, keep it simple and use List<T> etc.
Of course, this only applies if you are reading all of the data; a dictionary would be quicker for key-based lookups.
Here are my results using "int" (the second number is a checksum to verify they all did the same work):
(edited to fix bug)
List/for: 1971ms (589725196)
Array/for: 1864ms (589725196)
List/foreach: 3054ms (589725196)
Array/foreach: 1860ms (589725196)
based on the test rig:
using System;
using System.Collections.Generic;
using System.Diagnostics;
static class Program
{
static void Main()
{
List<int> list = new List<int>(6000000);
Random rand = new Random(12345);
for (int i = 0; i < 6000000; i++)
{
list.Add(rand.Next(5000));
}
int[] arr = list.ToArray();
int chk = 0;
Stopwatch watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
int len = list.Count;
for (int i = 0; i < len; i++)
{
chk += list[i];
}
}
watch.Stop();
Console.WriteLine("List/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
for (int i = 0; i < arr.Length; i++)
{
chk += arr[i];
}
}
watch.Stop();
Console.WriteLine("Array/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in list)
{
chk += i;
}
}
watch.Stop();
Console.WriteLine("List/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in arr)
{
chk += i;
}
}
watch.Stop();
Console.WriteLine("Array/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
Console.ReadLine();
}
}
Answer 2:
Short answer:
A more detailed answer can be found at the following link: https://stackoverflow.com/a/29263914/4423545
Answer 3:
I think the performance will be quite similar.
The overhead involved in using a List vs. an Array arises, IMHO, when you add items to the list and the list has to increase the size of the array it uses internally because the capacity of that array has been reached.
Suppose you have a List with a Capacity of 10; the List will then increase its capacity once you want to add the 11th element.
You can decrease the performance impact by initializing the Capacity of the list to the number of items it will hold.
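As a rough illustration of the capacity point above, here is a minimal sketch that counts how often a List's Capacity changes while items are added. The names CapacityDemo and CountReallocations are made up for this example; only List<T>.Capacity and the List<T>(int) constructor are part of the framework. How many reallocations the default-constructed list performs depends on the runtime's growth policy, so no figure is claimed for it.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Hypothetical helper: adds `count` items and counts how many times
    // the list's Capacity changed, i.e. how often the backing array was replaced.
    public static int CountReallocations(List<int> list, int count)
    {
        int reallocations = 0;
        int lastCapacity = list.Capacity;
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                reallocations++;
                lastCapacity = list.Capacity;
            }
        }
        return reallocations;
    }

    public static void Main()
    {
        const int n = 6000000;
        // Default constructor: the list has to grow repeatedly.
        Console.WriteLine("default:  {0} reallocations", CountReallocations(new List<int>(), n));
        // Capacity given up front: the backing array never has to grow.
        Console.WriteLine("presized: {0} reallocations", CountReallocations(new List<int>(n), n));
    }
}
```

The presized list reports 0 reallocations because its capacity already covers every item that is added.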
But, in order to figure out whether iterating over a List is as fast as iterating over an array, why don't you benchmark it?
int numberOfElements = 6000000;
List<int> theList = new List<int> (numberOfElements);
int[] theArray = new int[numberOfElements];
for( int i = 0; i < numberOfElements; i++ )
{
theList.Add (i);
theArray[i] = i;
}
Stopwatch chrono = new Stopwatch ();
chrono.Start ();
int j;
for( int i = 0; i < numberOfElements; i++ )
{
j = theList[i];
}
chrono.Stop ();
Console.WriteLine (String.Format("iterating the List took {0} msec", chrono.ElapsedMilliseconds));
chrono.Reset();
chrono.Start();
for( int i = 0; i < numberOfElements; i++ )
{
j = theArray[i];
}
chrono.Stop ();
Console.WriteLine (String.Format("iterating the array took {0} msec", chrono.ElapsedMilliseconds));
Console.ReadLine();
On my system, iterating over the array took 33 msec; iterating over the list took 66 msec.
To be honest, I didn't expect the variation to be that large.
So I've put my iteration in a loop: now I execute both iterations 1000 times.
The results are:
iterating the List took 67146 msec
iterating the array took 40821 msec
Now, the variation is not that large anymore, but still ...
Therefore, I've started up .NET Reflector, and the getter of the indexer of the List class looks like this:
public T get_Item(int index)
{
if (index >= this._size)
{
ThrowHelper.ThrowArgumentOutOfRangeException();
}
return this._items[index];
}
As you can see, when you use the indexer of the List, the List checks whether you are going out of the bounds of the internal array. This additional check comes with a cost.
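For readers on a newer runtime, one way to sidestep the per-access check in the List indexer is to iterate over a span of the list's backing array. The sketch below assumes .NET 5 or later, where CollectionsMarshal.AsSpan exists; it was not available when this answer was written. The names SpanSumDemo, SumViaIndexer and SumViaSpan are made up for illustration. Whether it is actually faster in your case is something only your own measurement can tell.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public static class SpanSumDemo
{
    // Every list[i] goes through the List indexer getter and its range check.
    public static int SumViaIndexer(List<int> list)
    {
        int sum = 0;
        for (int i = 0; i < list.Count; i++)
        {
            sum += list[i];
        }
        return sum;
    }

    // AsSpan exposes the list's backing array directly.
    // The list must not be added to or removed from while the span is in use.
    public static int SumViaSpan(List<int> list)
    {
        Span<int> span = CollectionsMarshal.AsSpan(list);
        int sum = 0;
        for (int i = 0; i < span.Length; i++)
        {
            sum += span[i];
        }
        return sum;
    }

    public static void Main()
    {
        List<int> list = new List<int> { 1, 2, 3, 4 };
        Console.WriteLine(SumViaIndexer(list)); // prints 10
        Console.WriteLine(SumViaSpan(list));    // prints 10
    }
}
```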
Answer 4:
If you are just getting a single value out of either (not in a loop), then both do bounds checking (you're in managed code, remember); it's just that the list does it twice.
See the notes later for why this is likely not a big deal.
If you are using your own for (int i = 0; i < x.[Length/Count]; i++) then the key difference is as follows:
- Array:
  - bounds checking is removed
- List:
  - bounds checking is performed
If you are using foreach then the key difference is as follows:
- Array:
  - no object is allocated to manage the iteration
  - bounds checking is removed
- List via a variable known to be List:
  - the iteration management variable is stack allocated
  - bounds checking is performed
- List via a variable known to be IList:
  - the iteration management variable is heap allocated
  - bounds checking is performed
Also, a List's values may not be altered during the foreach, whereas an array's can be.
The bounds checking is often no big deal (especially if you are on a cpu with a deep pipeline and branch prediction - the norm for most these days) but only your own profiling can tell you if that is an issue.
If you are in parts of your code where you are avoiding heap allocations (good examples are libraries or in hashcode implementations) then ensuring the variable is typed as List not IList will avoid that pitfall.
As always profile if it matters.
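The List vs. IList point can be sketched as follows. This is a minimal illustration, not a benchmark; EnumeratorDemo and its method names are invented for the example. It relies on two things about the framework: List<T>.GetEnumerator() returns the struct List<T>.Enumerator, and IList<T> only exposes GetEnumerator() through IEnumerable<T>, which returns an interface reference.

```csharp
using System;
using System.Collections.Generic;

public static class EnumeratorDemo
{
    // List<int>.Enumerator is a struct, so foreach over a List<int> variable
    // can keep the enumerator on the stack.
    public static bool ListEnumeratorIsStruct()
    {
        return typeof(List<int>.Enumerator).IsValueType;
    }

    // The static type is List<int>: foreach binds to the struct enumerator.
    public static int SumViaList(List<int> list)
    {
        int sum = 0;
        foreach (int value in list)
        {
            sum += value;
        }
        return sum;
    }

    // The static type is IList<int>: foreach goes through IEnumerator<int>,
    // so the same struct enumerator is boxed onto the heap.
    public static int SumViaIList(IList<int> list)
    {
        int sum = 0;
        foreach (int value in list)
        {
            sum += value;
        }
        return sum;
    }

    public static void Main()
    {
        List<int> list = new List<int> { 1, 2, 3 };
        Console.WriteLine(ListEnumeratorIsStruct()); // prints True
        Console.WriteLine(SumViaList(list));         // prints 6
        Console.WriteLine(SumViaIList(list));        // prints 6
    }
}
```

Both methods return the same sum; the difference is only in how the enumerator is held, which is why the variable's declared type matters in allocation-sensitive code.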
Answer 5:
[See also this question]
I've modified Marc's answer to use actual random numbers and actually do the same work in all cases.
Results:
            for      foreach
Array :  1575ms     1575ms (+0%)
List  :  1630ms     2627ms (+61%)
          (+3%)      (+67%)
(Checksum: -1000038876)
Compiled as Release under VS 2008 SP1. Running without debugging on a Q6600@2.40GHz, .NET 3.5 SP1.
Code:
using System;
using System.Collections.Generic;
using System.Diagnostics;
class Program
{
static void Main(string[] args)
{
List<int> list = new List<int>(6000000);
Random rand = new Random(1);
for (int i = 0; i < 6000000; i++)
{
list.Add(rand.Next());
}
int[] arr = list.ToArray();
int chk = 0;
Stopwatch watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
int len = list.Count;
for (int i = 0; i < len; i++)
{
chk += list[i];
}
}
watch.Stop();
Console.WriteLine("List/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
int len = arr.Length;
for (int i = 0; i < len; i++)
{
chk += arr[i];
}
}
watch.Stop();
Console.WriteLine("Array/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in list)
{
chk += i;
}
}
watch.Stop();
Console.WriteLine("List/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in arr)
{
chk += i;
}
}
watch.Stop();
Console.WriteLine("Array/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
Console.WriteLine();
Console.ReadLine();
}
}
Answer 6:
The measurements are nice, but you are going to get significantly different results depending on what exactly you're doing in your inner loop. Measure your own situation. If you're using multi-threading, that alone is a non-trivial activity.
Answer 7:
Indeed, if you perform some complex calculations inside the loop, then the difference between the array indexer and the list indexer may be so small that, in the end, it doesn't matter.
Answer 8:
Do not attempt to add capacity to an array by increasing the number of elements one at a time.
Performance
List For Add: 1ms
Array For Add: 2397ms
Stopwatch watch;
Random rand = new Random(); // not declared in the original snippet; added so that it compiles
#region --> List For Add <--
List<int> intList = new List<int>();
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 60000; rpt++)
{
intList.Add(rand.Next());
}
watch.Stop();
Console.WriteLine("List For Add: {0}ms", watch.ElapsedMilliseconds);
#endregion
#region --> Array For Add <--
int[] intArray = new int[0];
watch = Stopwatch.StartNew();
int sira = 0;
for (int rpt = 0; rpt < 60000; rpt++)
{
sira += 1;
Array.Resize(ref intArray, intArray.Length + 1);
intArray[rpt] = rand.Next();
}
watch.Stop();
Console.WriteLine("Array For Add: {0}ms", watch.ElapsedMilliseconds);
#endregion
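The array loop above is slow because Array.Resize copies the whole array on every single add. If you do need a raw array that grows, a common alternative is to double the length whenever it is full, so the number of copies stays small. The sketch below is one way to do that; GrowableArrayDemo and Append are hypothetical names, and the initial size of 4 and the doubling factor are assumptions chosen for illustration.

```csharp
using System;

public static class GrowableArrayDemo
{
    // Stores `value` at position `count`, doubling the buffer when it is full.
    // Returns the new element count.
    public static int Append(ref int[] buffer, int count, int value)
    {
        if (count == buffer.Length)
        {
            int newLength = buffer.Length == 0 ? 4 : buffer.Length * 2;
            Array.Resize(ref buffer, newLength); // copies only when the buffer is full
        }
        buffer[count] = value;
        return count + 1;
    }

    public static void Main()
    {
        int[] buffer = new int[0];
        int count = 0;
        for (int i = 0; i < 60000; i++)
        {
            count = Append(ref buffer, count, i);
        }
        Array.Resize(ref buffer, count); // trim the unused tail once at the end
        Console.WriteLine(buffer.Length); // prints 60000
    }
}
```

This is in the same spirit as the growth that List performs internally when its capacity is reached, which is why the List loop in the test above does not suffer in the same way.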
Answer 9:
Here's one that uses a Dictionary and an IEnumerable:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
static class Program
{
static void Main()
{
List<int> list = new List<int>(6000000);
for (int i = 0; i < 6000000; i++)
{
list.Add(i);
}
Console.WriteLine("Count: {0}", list.Count);
int[] arr = list.ToArray();
IEnumerable<int> Ienumerable = list.ToArray();
Dictionary<int, bool> dict = list.ToDictionary(x => x, y => true);
int chk = 0;
Stopwatch watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
int len = list.Count;
for (int i = 0; i < len; i++)
{
chk += list[i];
}
}
watch.Stop();
Console.WriteLine("List/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
for (int i = 0; i < arr.Length; i++)
{
chk += arr[i];
}
}
watch.Stop();
Console.WriteLine("Array/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in Ienumerable)
{
chk += i;
}
}
watch.Stop();
Console.WriteLine("Ienumerable/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in dict.Keys)
{
chk += i;
}
}
watch.Stop();
Console.WriteLine("Dict/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in list)
{
chk += i;
}
}
watch.Stop();
Console.WriteLine("List/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in arr)
{
chk += i;
}
}
watch.Stop();
Console.WriteLine("Array/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in Ienumerable)
{
chk += i;
}
}
watch.Stop();
Console.WriteLine("Ienumerable/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in dict.Keys)
{
chk += i;
}
}
watch.Stop();
Console.WriteLine("Dict/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);
Console.ReadLine();
}
}
Answer 10:
I was worried that the benchmarks posted in other answers would still leave room for the compiler to optimize, eliminate or merge loops, so I wrote one that:
- Uses unpredictable inputs (random)
- Runs a calculation with the result printed to the console
- Modifies the input data on each repetition
The result is that direct array access is about 2.5 times as fast as access to the same array wrapped in an IList:
- 1 billion array accesses: 4000 ms
- 1 billion list accesses: 10000 ms
- 100 million array accesses: 350 ms
- 100 million list accesses: 1000 ms
Here's the code:
static void Main(string[] args) {
const int TestPointCount = 1000000;
const int RepetitionCount = 1000;
Stopwatch arrayTimer = new Stopwatch();
Stopwatch listTimer = new Stopwatch();
Point2[] points = new Point2[TestPointCount];
var random = new Random();
for (int index = 0; index < TestPointCount; ++index) {
points[index].X = random.NextDouble();
points[index].Y = random.NextDouble();
}
for (int repetition = 0; repetition <= RepetitionCount; ++repetition) {
if (repetition > 0) { // first repetition is for cache warmup
arrayTimer.Start();
}
doWorkOnArray(points);
if (repetition > 0) { // first repetition is for cache warmup
arrayTimer.Stop();
}
if (repetition > 0) { // first repetition is for cache warmup
listTimer.Start();
}
doWorkOnList(points);
if (repetition > 0) { // first repetition is for cache warmup
listTimer.Stop();
}
}
Console.WriteLine("Ignore this: " + points[0].X + points[0].Y);
Console.WriteLine(
string.Format(
"{0} accesses on array took {1} ms",
RepetitionCount * TestPointCount, arrayTimer.ElapsedMilliseconds
)
);
Console.WriteLine(
string.Format(
"{0} accesses on list took {1} ms",
RepetitionCount * TestPointCount, listTimer.ElapsedMilliseconds
)
);
}
private static void doWorkOnArray(Point2[] points) {
var random = new Random();
int pointCount = points.Length;
Point2 accumulated = Point2.Zero;
for (int index = 0; index < pointCount; ++index) {
accumulated.X += points[index].X;
accumulated.Y += points[index].Y;
}
accumulated /= pointCount;
// make use of the result somewhere so the optimizer can't eliminate the loop
// also modify the input collection so the optimizer can't merge the repetition loop
points[random.Next(0, pointCount)] = accumulated;
}
private static void doWorkOnList(IList<Point2> points) {
var random = new Random();
int pointCount = points.Count;
Point2 accumulated = Point2.Zero;
for (int index = 0; index < pointCount; ++index) {
accumulated.X += points[index].X;
accumulated.Y += points[index].Y;
}
accumulated /= pointCount;
// make use of the result somewhere so the optimizer can't eliminate the loop
// also modify the input collection so the optimizer can't merge the repetition loop
points[random.Next(0, pointCount)] = accumulated;
}
Answer 11:
Since List<> uses arrays internally, the basic performance should be the same. There are two reasons why the List might be slightly slower:
- To look up an element in the list, a method of List is called, which does the lookup in the underlying array. So you need an additional method call there. On the other hand, the compiler might recognize this and optimize the "unnecessary" call away.
- The compiler might do some special optimizations if it knows the size of the array that it can't do for a list of unknown length. This might bring some performance improvement if you only have a few elements in your list.
To check whether it makes any difference for you, it's probably best to adjust the posted timing functions to a list of the size you're planning to use and see what the results are for your particular case.
Answer 12:
Since I had a similar question, this gave me a fast start.
My question is a bit more specific: 'what is the fastest method for a reflexive array implementation?'
The testing done by Marc Gravell shows a lot, but not exactly access timing. His timings include the looping over the arrays and lists as well. Since I also came up with a third method that I wanted to test, a 'Dictionary', just to compare, I extended his test code.
First, I do a test using a constant, which gives me a certain timing including the loop. This is a 'bare' timing, excluding the actual access.
Then I do a test that accesses the subject structure; this gives me an 'overhead included' timing, covering looping and actual access.
The difference between the 'bare' timing and the 'overhead included' timing gives me an indication of the 'structure access' timing.
But how accurate is this timing? During the test Windows will do some time slicing for sure. I have no information about the time slicing, but I assume it is evenly distributed during the test and in the order of tens of msec, which means that the accuracy of the timing should be in the order of +/- 100 msec or so. A bit of a rough estimate? Anyway, it is a source of systematic measurement error.
Also, the tests were done in 'Debug' mode with no optimisation. Otherwise the compiler might change the actual test code.
So, I get two results, one for a constant, marked '(c)', and one for access, marked '(n)', and the difference 'dt' tells me how much time the actual access takes.
And these are the results:
Dictionary(c)/for: 1205ms (600000000)
Dictionary(n)/for: 8046ms (589725196)
dt = 6841
List(c)/for: 1186ms (1189725196)
List(n)/for: 2475ms (1779450392)
dt = 1289
Array(c)/for: 1019ms (600000000)
Array(n)/for: 1266ms (589725196)
dt = 247
Dictionary[key](c)/foreach: 2738ms (600000000)
Dictionary[key](n)/foreach: 10017ms (589725196)
dt = 7279
List(c)/foreach: 2480ms (600000000)
List(n)/foreach: 2658ms (589725196)
dt = 178
Array(c)/foreach: 1300ms (600000000)
Array(n)/foreach: 1592ms (589725196)
dt = 292
dt +/- .1 sec   for    foreach
Dictionary      6.8    7.3
List            1.3    0.2
Array           0.2    0.3
Same test, different system:
dt +/- .1 sec   for    foreach
Dictionary      14.4   12.0
List            1.7    0.1
Array           0.5    0.7
With better estimates on the timing errors (how to remove the systematic measurement error due to time slicing?) more could be said about the results.
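One common heuristic for reducing the influence of time slicing is to repeat each measurement several times and keep the fastest run, on the assumption that the fastest run was the one least disturbed by the scheduler. The sketch below shows that idea; it is not what was used for the numbers above, it does not remove the error entirely, and TimingDemo and BestOfMilliseconds are names invented for this example.

```csharp
using System;
using System.Diagnostics;

public static class TimingDemo
{
    // Runs `work` `runs` times and returns the shortest elapsed time in milliseconds.
    public static long BestOfMilliseconds(Action work, int runs)
    {
        if (runs < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(runs));
        }
        long best = long.MaxValue;
        for (int r = 0; r < runs; r++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            work();
            watch.Stop();
            if (watch.ElapsedMilliseconds < best)
            {
                best = watch.ElapsedMilliseconds;
            }
        }
        return best;
    }

    public static void Main()
    {
        int[] arr = new int[6000000];
        for (int i = 0; i < arr.Length; i++)
        {
            arr[i] = i % 5000;
        }
        int chk = 0;
        long best = BestOfMilliseconds(() =>
        {
            for (int i = 0; i < arr.Length; i++)
            {
                chk += arr[i];
            }
        }, 10);
        // chk is printed so the summation cannot be optimized away
        Console.WriteLine("Array/for best of 10: {0}ms ({1})", best, chk);
    }
}
```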
It looks like List/foreach has the fastest access, but the overhead is killing it.
The difference between List/for and List/foreach is strange. Maybe some caching is involved?
Further, for access to an array it does not matter whether you use a for loop or a foreach loop. Given the timing results and their accuracy, the two are comparable.
Using a dictionary is by far the slowest; I only considered it because on the left side (the indexer) I have a sparse list of integers and not a range as is used in these tests.
Here is the modified test code.
Dictionary<int, int> dict = new Dictionary<int, int>(6000000);
List<int> list = new List<int>(6000000);
Random rand = new Random(12345);
for (int i = 0; i < 6000000; i++)
{
int n = rand.Next(5000);
dict.Add(i, n);
list.Add(n);
}
int[] arr = list.ToArray();
int chk = 0;
Stopwatch watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
int len = dict.Count;
for (int i = 0; i < len; i++)
{
chk += 1; // dict[i];
}
}
watch.Stop();
long c_dt = watch.ElapsedMilliseconds;
Console.WriteLine(" Dictionary(c)/for: {0}ms ({1})", c_dt, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
int len = dict.Count;
for (int i = 0; i < len; i++)
{
chk += dict[i];
}
}
watch.Stop();
long n_dt = watch.ElapsedMilliseconds;
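Console.WriteLine(" Dictionary(n)/for: {0}ms ({1})", n_dt, chk);
Console.WriteLine("dt = {0}", n_dt - c_dt);
// Note: chk is not reset before the next two loops, which is why the
// List(c)/for and List(n)/for checksums in the results carry over the earlier totals.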
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
int len = list.Count;
for (int i = 0; i < len; i++)
{
chk += 1; // list[i];
}
}
watch.Stop();
c_dt = watch.ElapsedMilliseconds;
Console.WriteLine(" List(c)/for: {0}ms ({1})", c_dt, chk);
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
int len = list.Count;
for (int i = 0; i < len; i++)
{
chk += list[i];
}
}
watch.Stop();
n_dt = watch.ElapsedMilliseconds;
Console.WriteLine(" List(n)/for: {0}ms ({1})", n_dt, chk);
Console.WriteLine("dt = {0}", n_dt - c_dt);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
for (int i = 0; i < arr.Length; i++)
{
chk += 1; // arr[i];
}
}
watch.Stop();
c_dt = watch.ElapsedMilliseconds;
Console.WriteLine(" Array(c)/for: {0}ms ({1})", c_dt, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
for (int i = 0; i < arr.Length; i++)
{
chk += arr[i];
}
}
watch.Stop();
n_dt = watch.ElapsedMilliseconds;
Console.WriteLine("Array(n)/for: {0}ms ({1})", n_dt, chk);
Console.WriteLine("dt = {0}", n_dt - c_dt);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in dict.Keys)
{
chk += 1; // dict[i];
}
}
watch.Stop();
c_dt = watch.ElapsedMilliseconds;
Console.WriteLine("Dictionary[key](c)/foreach: {0}ms ({1})", c_dt, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in dict.Keys)
{
chk += dict[i];
}
}
watch.Stop();
n_dt = watch.ElapsedMilliseconds;
Console.WriteLine("Dictionary[key](n)/foreach: {0}ms ({1})", n_dt, chk);
Console.WriteLine("dt = {0}", n_dt - c_dt);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in list)
{
chk += 1; // i;
}
}
watch.Stop();
c_dt = watch.ElapsedMilliseconds;
Console.WriteLine(" List(c)/foreach: {0}ms ({1})", c_dt, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in list)
{
chk += i;
}
}
watch.Stop();
n_dt = watch.ElapsedMilliseconds;
Console.WriteLine(" List(n)/foreach: {0}ms ({1})", n_dt, chk);
Console.WriteLine("dt = {0}", n_dt - c_dt);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in arr)
{
chk += 1; // i;
}
}
watch.Stop();
c_dt = watch.ElapsedMilliseconds;
Console.WriteLine(" Array(c)/foreach: {0}ms ({1})", c_dt, chk);
chk = 0;
watch = Stopwatch.StartNew();
for (int rpt = 0; rpt < 100; rpt++)
{
foreach (int i in arr)
{
chk += i;
}
}
watch.Stop();
n_dt = watch.ElapsedMilliseconds;
Console.WriteLine("Array(n)/foreach: {0}ms ({1})", n_dt, chk);
Console.WriteLine("dt = {0}", n_dt - c_dt);
Answer 13:
In some brief tests I have found a combination of the two to be better in what I would call reasonably intensive Math:
Type: List<double[]>
Time: 00:00:05.1861300
Type: List<List<double>>
Time: 00:00:05.7941351
Type: double[rows * columns]
Time: 00:00:06.0547118
Running the Code:
int rows = 10000;
int columns = 10000;
IMatrix Matrix = new IMatrix(rows, columns);
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
for (int r = 0; r < Matrix.Rows; r++)
for (int c = 0; c < Matrix.Columns; c++)
Matrix[r, c] = Math.E;
for (int r = 0; r < Matrix.Rows; r++)
for (int c = 0; c < Matrix.Columns; c++)
Matrix[r, c] *= -Math.Log(Math.E);
stopwatch.Stop();
TimeSpan ts = stopwatch.Elapsed;
Console.WriteLine(ts.ToString());
I do wish we had some top-notch hardware-accelerated matrix classes like the .NET team has done with the System.Numerics.Vectors class!
C# could be the best ML language with a bit more work in this area!