I have a plain and simple Java thread like this running on my dual-core machine (Windows XP, 32-bit environment):
public class Main {
    public static void main(String[] strs) {
        long j = 0;
        for (long i = 0; i < Long.MAX_VALUE; i++)
            j++;
        System.out.println(j);
    }
}
My expectation was that it would stick to a single CPU to fully exploit the high-speed cache (since the loop keeps operating on the local variable j), so that one CPU's utilization would be 100% and the other would be pretty much idle.
To my surprise, both CPUs are utilized at around 40%~60% after the thread starts, with one CPU's utilization slightly higher than the other's.
My question is: is there an OS load-balancing mechanism that kicks in when an imbalance is detected? In my case, is it possible that Windows noticed one CPU hitting nearly 100% while the other sat almost idle, and so periodically reschedules the thread onto the other CPU?
EDIT 1:
I've found a possible explanation:
http://siber.cankaya.edu.tr/ozdogan/OperatingSystems/ceng328/node130.html
When the OS executes threads, it runs each thread for a certain period of time (say 10-20ms), then saves the state of the thread, and looks for other threads to run.
Now, despite what you might think from looking at the CPU Utilization graph, the OS is actually running a lot more threads than those from your program. There are threads running UI loops, threads waiting on I/O, threads running background services, etc. Most of the threads spend most of their time blocked waiting on something.
The reason why I'm talking about this is to explain that from the OS's point of view, the situation is more complex than it might look. There are a whole bunch of threads doing a whole bunch of things, and the OS is attempting to switch between them. Suppose you wanted to implement a heuristic that if a thread used up its entire quantum the last time it ran, the OS would make an effort to schedule it on the same core. The OS would need to track and take into account more information, and the success of the optimization would depend on a lot of hard-to-predict factors.
In addition, the benefit of affinitizing a thread to a core is often negligible in practice, so OSes don't attempt to do it automatically. Instead, they expose a feature that allows the developer to explicitly say that a particular thread should be affinitized to a core, and the OS will then respect that decision.
That seems like a reasonable trade-off: if your thread performs better when affinitized to a core, just ask the OS to do that, but the OS won't bother attempting to figure it out for you.
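Java itself has no standard thread-affinity API, so on Windows this means dropping to native code (or a JNI/third-party wrapper). As a minimal sketch, assuming the Win32 SetThreadAffinityMask call (the spin loop here is just an arbitrary stand-in for real work):

#include <stdio.h>
#include <windows.h>

int main(void)
{
    /* Pin the current thread to core 0: the mask is a bit vector,
       where bit n set means the thread may run on logical CPU n. */
    DWORD_PTR previous = SetThreadAffinityMask(GetCurrentThread(), 1);
    if (previous == 0) {
        fprintf(stderr, "SetThreadAffinityMask failed: %lu\n", GetLastError());
        return 1;
    }
    /* From here on, the scheduler only runs this thread on core 0,
       so Task Manager should show that one core near 100%. */
    for (volatile long long i = 0; i < 2000000000LL; i++)
        ;
    return 0;
}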
As you mention, the OS will bounce threads around. The following native code exhibits the same behavior you described.
#include <stdbool.h>

int main(int argc, char** argv)
{
    while (true)
        ;  /* spin forever, keeping one logical CPU busy */
    return 0;
}
If you look at the process, it sits constantly at 25% overall CPU (on a quad-core), but the Windows 7 Resource Monitor shows that none of the 4 cores is at a constant 100%, even though core 0 shows higher usage than the others.
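You can also observe the bouncing directly. Here is a sketch, assuming Windows Vista or later (GetCurrentProcessorNumber is not available on XP), that spins and reports whenever the scheduler migrates it to a different core:

#include <stdio.h>
#include <windows.h>

int main(void)
{
    DWORD last = GetCurrentProcessorNumber();
    printf("started on core %lu\n", last);
    while (1) {
        DWORD now = GetCurrentProcessorNumber();
        if (now != last) {   /* the scheduler moved us */
            printf("migrated to core %lu\n", now);
            last = now;
        }
    }
    return 0;
}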
The CPU may share cache between the cores (many dual-core parts share the L2 cache, for example), so this behavior doesn't mean the cache is not being used.
http://en.wikipedia.org/wiki/Preemption_%28computing%29
http://en.wikipedia.org/wiki/Computer_multitasking#Preemptive_multitasking.2Ftime-sharing