I'm doing a raytracer hobby project, and originally I was using structs for my Vector and Ray objects, and I thought a raytracer was the perfect situation to use them: you create millions of them, they don't live longer than a single method, they're lightweight. However, by simply changing 'struct' to 'class' on Vector and Ray, I got a very significant performance gain.
What gives? They're both small (3 floats for a Vector, 2 Vectors for a Ray) and they don't get copied around excessively. I do pass them to methods when needed, of course, but that's inevitable. So what are the common pitfalls that kill performance when using structs? I've read this MSDN article that says the following:
When you run this example, you'll see that the struct loop is orders of magnitude faster. However, it is important to beware of using ValueTypes when you treat them like objects. This adds extra boxing and unboxing overhead to your program, and can end up costing you more than it would if you had stuck with objects! To see this in action, modify the code above to use an array of foos and bars. You'll find that the performance is more or less equal.
The article is quite old (2001), though, and the claim that putting them in an array causes boxing/unboxing struck me as odd. Is that true? I had pre-calculated the primary rays and put them in an array, so I took the article's advice and calculated each primary ray only when I needed it, never adding them to an array. It didn't change anything: with classes, it was still 1.5x faster.
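For what it's worth, a typed array of structs does not box; the boxing the article warns about happens when a struct is stored as object, which is what the pre-generics collections (like ArrayList) do. Here's a minimal sketch illustrating the difference, using a hypothetical PointStruct type:

```csharp
using System;
using System.Collections;

struct PointStruct
{
    public float X, Y, Z;
    public PointStruct(float x, float y, float z) { X = x; Y = y; Z = z; }
}

class BoxingDemo
{
    static void Main()
    {
        // A typed array stores the struct values inline: no boxing occurs.
        PointStruct[] typed = new PointStruct[1000];
        typed[0] = new PointStruct(1, 2, 3);

        // Storing a struct as 'object' (e.g. in the pre-generics ArrayList)
        // boxes it: a heap object is allocated and the value is copied in.
        ArrayList untyped = new ArrayList();
        untyped.Add(new PointStruct(1, 2, 3));   // box: heap allocation
        PointStruct p = (PointStruct)untyped[0]; // unbox: copy back out
        Console.WriteLine(p.X);
    }
}
```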
I am running .NET 3.5 SP1, which I believe fixed an issue where struct methods were never inlined, so that can't be it either.
So basically: any tips, things to consider and what to avoid?
EDIT: As suggested in some answers, I've set up a test project where I've tried passing structs as ref. The methods for adding two Vectors:
public static VectorStruct Add(VectorStruct v1, VectorStruct v2)
{
    return new VectorStruct(v1.X + v2.X, v1.Y + v2.Y, v1.Z + v2.Z);
}

public static VectorStruct Add(ref VectorStruct v1, ref VectorStruct v2)
{
    return new VectorStruct(v1.X + v2.X, v1.Y + v2.Y, v1.Z + v2.Z);
}

public static void Add(ref VectorStruct v1, ref VectorStruct v2, out VectorStruct v3)
{
    v3 = new VectorStruct(v1.X + v2.X, v1.Y + v2.Y, v1.Z + v2.Z);
}
For each I got a variation of the following benchmark method:
VectorStruct StructTest()
{
    Stopwatch sw = new Stopwatch();
    sw.Start();
    var v2 = new VectorStruct(0, 0, 0);
    for (int i = 0; i < 100000000; i++)
    {
        var v0 = new VectorStruct(i, i, i);
        var v1 = new VectorStruct(i, i, i);
        v2 = VectorStruct.Add(ref v0, ref v1);
    }
    sw.Stop();
    Console.WriteLine(sw.Elapsed.ToString());
    return v2; // To make sure v2 doesn't get optimized away because it's unused.
}
All seem to perform pretty much identical. Is it possible that they get optimized by the JIT to whatever is the optimal way to pass this struct?
EDIT2: I must note by the way that using structs in my test project is about 50% faster than using a class. Why this is different for my raytracer I don't know.
The first thing I would check is whether you have explicitly implemented Equals and GetHashCode. Failing to do this means the runtime's default implementation does some very expensive work to compare two struct instances (internally it can fall back to reflection to discover the private fields and then checks them for equality, which causes a significant amount of allocation).
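A sketch of the pattern, using a hypothetical Vector3 struct (the names are illustrative; the point is the strongly-typed Equals, the object override, and GetHashCode):

```csharp
using System;

// Implementing IEquatable<T> gives collections a strongly-typed Equals
// to call, avoiding both boxing and the reflection-based default.
struct Vector3 : IEquatable<Vector3>
{
    public readonly float X, Y, Z;
    public Vector3(float x, float y, float z) { X = x; Y = y; Z = z; }

    // Strongly-typed comparison: no boxing, no reflection.
    public bool Equals(Vector3 other)
    {
        return X == other.X && Y == other.Y && Z == other.Z;
    }

    // Also override the object overload so untyped callers stay correct.
    public override bool Equals(object obj)
    {
        return obj is Vector3 && Equals((Vector3)obj);
    }

    // A simple field-combining hash; any reasonable mix will do.
    public override int GetHashCode()
    {
        return X.GetHashCode() ^ (Y.GetHashCode() << 2) ^ (Z.GetHashCode() >> 2);
    }
}
```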
Generally though, the best thing you can do is to run your code under a profiler and see where the slow parts are. It can be an eye-opening experience.
Anything written about boxing/unboxing prior to .NET generics should be taken with a grain of salt. Generic collection types have removed the need to box and unbox value types, which makes using structs in these situations more valuable.
As for your specific slowdown - we'd probably need to see some code.
I use structs basically for parameter objects, returning multiple pieces of information from a function, and... nothing else. Don't know whether it's "right" or "wrong," but that's what I do.
I think the key lies in two statements from your post: that you got a significant gain just by changing 'struct' to 'class', and that you pass these structs to methods when needed.
Now unless your struct is no larger than a reference (4 bytes on a 32-bit system, 8 bytes on a 64-bit one), you are copying much more on each method call than if you simply passed an object reference.
Basically, don't make them too big, and pass them around by ref when you can. I discovered this the exact same way... By changing my Vector and Ray classes to structs.
With more memory being passed around, it's bound to cause cache thrashing.
My own ray tracer also uses struct Vectors (though not Rays), and changing Vector to a class does not appear to have any impact on performance. I'm currently using three doubles for the vector, so it may be bigger than it needs to be. One thing to note, though (it might be obvious, but it wasn't for me): run the program outside of Visual Studio. Even with an optimized release build, you can get a massive speed boost by starting the exe outside of VS. Any benchmarking you do should take this into account.
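One cheap safeguard for this (a sketch; the BenchmarkGuard name and the empty benchmark body are placeholders): check Debugger.IsAttached at the top of the benchmark, so a run started with F5 inside Visual Studio at least warns you that the timings may be unreliable.

```csharp
using System;
using System.Diagnostics;

class BenchmarkGuard
{
    static void Main()
    {
        // Running under a debugger (even a Release build launched with F5)
        // can disable JIT optimizations, so flag it before trusting numbers.
        if (Debugger.IsAttached)
        {
            Console.WriteLine("Warning: debugger attached; timings may be unreliable.");
        }

        Stopwatch sw = Stopwatch.StartNew();
        // ... benchmark body goes here ...
        sw.Stop();
        Console.WriteLine(sw.Elapsed);
    }
}
```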