I have a 123 MB int array, and it is basically used like this:
private static int[] data = new int[32487834];

static int eval(int[] c)
{
    int p = data[c[0]];
    p = data[p + c[1]];
    p = data[p + c[2]];
    p = data[p + c[3]];
    p = data[p + c[4]];
    p = data[p + c[5]];
    return data[p + c[6]];
}
eval() is called a lot (~50 billion times) with different c, and I would like to know if (and how) I could speed it up.
I already use an unsafe function with a fixed array, and the work is spread across all CPU cores. It's a C# port of the TwoPlusTwo 7-card evaluator by RayW. The C++ version is only insignificantly faster.
Can the GPU be used to speed this up?
- Cache the array reference in a local variable. Static field accesses are generally slower than locals for several reasons: among them, the field can be changed by other code, so it has to be reloaded all the time, whereas the JIT can optimize locals much more freely.
- Don't use an array as the argument to the method. Hard-code seven int index parameters instead. That removes the argument-array allocation, the indirection penalty, and the bounds checks on it.
- Use unsafe code to index into the array; this eliminates bounds checking. Use a GCHandle to pin the array and cache the pointer in a static field (don't just use a fixed block on every call; I believe entering it carries a certain small overhead, though I'm not sure). A combined sketch of these first three points follows the list.
- As an alternative to pinning the array, allocate the 123 MB table using VirtualAlloc and use huge pages. That cuts down on TLB misses. A rough sketch of this follows as well.
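A minimal sketch of the first three points, assuming the table is still the static data array from the question. The class name, the pinned-pointer layout, and the Eval signature here are illustrative, not the exact code from the port, and it needs to be compiled with /unsafe:

using System;
using System.Runtime.InteropServices;

static unsafe class Evaluator
{
    private static readonly int[] data = new int[32487834];

    // Pin the array once and cache the raw pointer, so Eval() does not
    // pay for bounds checks or re-enter a fixed block on every call.
    private static readonly GCHandle handle;
    private static readonly int* basePtr;

    static Evaluator()
    {
        handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        basePtr = (int*)handle.AddrOfPinnedObject();
    }

    // Seven plain int parameters instead of an int[] argument:
    // no argument array to allocate and no bounds checks through it.
    public static int Eval(int c0, int c1, int c2, int c3,
                           int c4, int c5, int c6)
    {
        int* d = basePtr;   // local copy; the JIT can keep it in a register
        int p = d[c0];
        p = d[p + c1];
        p = d[p + c2];
        p = d[p + c3];
        p = d[p + c4];
        p = d[p + c5];
        return d[p + c6];
    }
}

The call sites then pass the seven indices directly, e.g. Evaluator.Eval(c0, c1, c2, c3, c4, c5, c6), instead of building an int[7] for each call.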
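For the huge-pages alternative, a rough P/Invoke sketch follows. The flag values are the standard Win32 constants; the LargePageTable class name is made up for illustration; and MEM_LARGE_PAGES only works if the account has the SeLockMemoryPrivilege ("Lock pages in memory"), otherwise the allocation fails:

using System;
using System.Runtime.InteropServices;

static unsafe class LargePageTable
{
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern IntPtr VirtualAlloc(IntPtr lpAddress, UIntPtr dwSize,
                                      uint flAllocationType, uint flProtect);

    [DllImport("kernel32.dll")]
    static extern UIntPtr GetLargePageMinimum();

    const uint MEM_COMMIT      = 0x00001000;
    const uint MEM_RESERVE     = 0x00002000;
    const uint MEM_LARGE_PAGES = 0x20000000;
    const uint PAGE_READWRITE  = 0x04;

    // Allocates the table on large pages so lookups cause far fewer TLB misses.
    public static int* Allocate(long elements)
    {
        ulong largePage = GetLargePageMinimum().ToUInt64();
        if (largePage == 0)
            throw new NotSupportedException("Large pages are not supported on this system.");

        // Round the size up to a multiple of the large-page size.
        ulong bytes = (ulong)elements * (ulong)sizeof(int);
        bytes = (bytes + largePage - 1) / largePage * largePage;

        IntPtr mem = VirtualAlloc(IntPtr.Zero, (UIntPtr)bytes,
                                  MEM_COMMIT | MEM_RESERVE | MEM_LARGE_PAGES,
                                  PAGE_READWRITE);
        if (mem == IntPtr.Zero)
            throw new InvalidOperationException(
                "VirtualAlloc with MEM_LARGE_PAGES failed: " + Marshal.GetLastWin32Error());

        return (int*)mem;
    }
}

The returned pointer would take the place of basePtr in the previous sketch, and the lookup table has to be copied into that raw buffer once at startup instead of living in a managed array.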
All of these are hardcore low-level optimizations. They only apply if you need maximum performance.
I think we are pretty much at the limit here when it comes to optimizing this function. We can probably only do better if you show the caller of the function so that caller and callee can be optimized as a single unit.