I have a memory heap manager that partitions the heap into segments, one per processor on the system. Memory can only be allocated from the partition that belongs to the processor the current thread is running on. I believe this will let different processors keep running even when two of them want to allocate memory at the same time.
I have found the function GetCurrentProcessorNumber() for Windows, but it is only available on Windows Vista and later. Is there a method that works on Windows XP?
Also, can this be done with pthreads on a POSIX system?
From the output of man sched_getcpu:
Unfortunately, this is Linux-specific. I doubt there is a portable way to do this.
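For reference, a minimal sketch of how the call might be wrapped, assuming Linux with glibc. The helper name current_cpu_or is made up for illustration; only sched_getcpu() itself is a real function.

```cpp
#if defined(__linux__)
#include <sched.h>   // sched_getcpu(); a GNU extension, g++ defines _GNU_SOURCE by default
#endif

// Hypothetical helper: returns the index of the CPU the calling thread is
// running on right now, or `fallback` if the call fails or the platform is
// not Linux. The value can be stale as soon as it is returned, because the
// scheduler may migrate the thread at any time.
inline int current_cpu_or(int fallback) {
#if defined(__linux__)
    int cpu = sched_getcpu();            // returns -1 and sets errno on failure
    return cpu >= 0 ? cpu : fallback;
#else
    return fallback;                     // no portable POSIX equivalent
#endif
}
```

A heap manager could then pick a partition with something like current_cpu_or(0) % partition_count, treating the result as a hint rather than a guarantee.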
In addition to Antony Vennard's answer and the code on the cited site, here is code that will work for Visual C++ x64 as well (no inline assembler):
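The code from this answer is not reproduced here. As a rough sketch only, and assuming the approach is the usual CPUID-based one (read the initial APIC ID from CPUID leaf 1), an intrinsic-only version might look like the following. The function name initial_apic_id is hypothetical, and the APIC ID is not guaranteed to equal the processor number the OS reports.

```cpp
#include <cstdint>
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
#include <intrin.h>    // __cpuid intrinsic (works for x64, no inline assembler)
#elif defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>     // __get_cpuid (GCC/Clang)
#endif

// Hypothetical helper: returns the initial APIC ID (0..255) of the logical
// processor executing this call, or -1 if CPUID leaf 1 is not available.
inline int initial_apic_id() {
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int regs[4];                         // EAX, EBX, ECX, EDX
    __cpuid(regs, 1);
    // Bits 31..24 of EBX hold the initial APIC ID.
    return static_cast<int>((static_cast<std::uint32_t>(regs[1]) >> 24) & 0xFF);
#elif defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return -1;
    return static_cast<int>((ebx >> 24) & 0xFF);
#else
    return -1;                           // not an x86 target
#endif
}
```

On systems where APIC IDs are sparse or exceed 8 bits, this value would need to be mapped to a dense index before being used to select a heap partition.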
A short look at the implementation of GetCurrentProcessorNumber() on Win7 x64 shows that it uses a different mechanism to get the processor number, but in my (few) tests the results were the same for my home-brewed function and the official one.
For XP, a quick Google search revealed this: https://www.cs.tcd.ie/Jeremy.Jones/GetCurrentProcessorNumberXP.htm Does this help?
This design smells bad to me. You seem to be assuming that a thread will stay associated with a specific CPU. That is not guaranteed. Yes, a thread may normally stay on a single CPU, but it doesn't have to, and eventually your program will have a thread that switches CPUs. It may not happen often, but it will happen. If your design doesn't take this into account, you will most likely hit some sort of hard-to-trace bug.
Let me ask this: what happens if memory is allocated on one CPU and freed on another? How will your heap handle that?
If all you want to do is avoid contention, you don't need to know the current CPU. You could just pick a heap at random, or you could have a heap per thread. You may get more or less contention that way, but you would avoid the overhead of polling the current CPU, which may or may not be significant. Also check out scalable_allocator from Intel Threading Building Blocks, which may have already solved this problem better than you will.
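To make the per-thread idea concrete, here is a minimal sketch that never asks for the current CPU. Every name in it (Arena, ArenaPool, BlockHeader) is made up for illustration, and std::malloc stands in for carving memory out of a real heap segment. Each thread gets a home arena round-robin, and each block carries a small header recording which arena owns it, which is one way to answer the "allocated here, freed there" question.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <new>
#include <vector>

// Hypothetical per-arena state: one lock and a live-block counter.
struct Arena {
    std::mutex lock;               // guards this arena's bookkeeping only
    std::size_t live_blocks = 0;
};

// Header stored in front of every block; aligned so the payload that
// follows it keeps malloc's alignment guarantee.
struct alignas(std::max_align_t) BlockHeader {
    std::size_t arena;             // index of the arena that owns the block
};

class ArenaPool {
public:
    explicit ArenaPool(std::size_t count) : arenas_(count) { assert(count > 0); }

    // Each thread draws a ticket once and reuses it, so threads spread
    // across arenas without querying the processor number.
    std::size_t home_index() const {
        static std::atomic<std::size_t> next_ticket{0};
        thread_local const std::size_t ticket =
            next_ticket.fetch_add(1, std::memory_order_relaxed);
        return ticket % arenas_.size();
    }

    void* allocate(std::size_t size) {
        const std::size_t idx = home_index();
        void* raw = std::malloc(sizeof(BlockHeader) + size);  // stand-in for a real segment
        if (!raw) return nullptr;
        {
            std::lock_guard<std::mutex> guard(arenas_[idx].lock);
            ++arenas_[idx].live_blocks;
        }
        BlockHeader* header = new (raw) BlockHeader{idx};
        return header + 1;         // payload starts right after the header
    }

    // Safe to call from any thread: the header names the owning arena,
    // so the block goes back to where it came from.
    void deallocate(void* p) {
        if (!p) return;
        BlockHeader* header = static_cast<BlockHeader*>(p) - 1;
        Arena& owner = arenas_[header->arena];
        {
            std::lock_guard<std::mutex> guard(owner.lock);
            --owner.live_blocks;
        }
        std::free(header);
    }

    std::size_t live_blocks(std::size_t idx) {
        std::lock_guard<std::mutex> guard(arenas_[idx].lock);
        return arenas_[idx].live_blocks;
    }

private:
    std::vector<Arena> arenas_;
};
```

Two threads only contend when they share a home arena, and a thread migrating between CPUs changes nothing, because the arena choice is tied to the thread rather than to the processor. The cost is one header per block.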