I was shocked when I traced this simple code:
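#include <chrono>  // std::chrono::nanoseconds
#include <thread>  // std::thread, std::this_thread::sleep_for

void foo()
{
    // One million 1 ns sleeps on a worker thread.
    for (int i = 0; i < 1000000; ++i) {
        std::this_thread::sleep_for(std::chrono::nanoseconds(1));
    }
}

int main()
{
    std::thread t(foo);
    t.join();
}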
Guess what? sleep_for calls FreeLibrary every time!
kernel32.dll!_FreeLibraryStub@4()
msvcr120d.dll!Concurrency::details::DeleteAsyncTimerAndUnloadLibrary(_TP_TIMER * timer) Line 707
msvcr120d.dll!Concurrency::details::_Timer::_Stop() Line 111
msvcr120d.dll!Concurrency::details::_Timer::~_Timer() Line 100
msvcr120d.dll!`Concurrency::wait'::`7'::TimerObj::~TimerObj()
msvcr120d.dll!Concurrency::wait(unsigned int milliseconds) Line 155
test826.exe!std::this_thread::sleep_until(const xtime * _Abs_time) Line 137
test826.exe!std::this_thread::sleep_for<__int64,std::ratio<1,1000000000> >(const std::chrono::duration<__int64,std::ratio<1,1000000000> > & _Rel_time) Line 162
test826.exe!foo() Line 6
Why does sleep_for have to call FreeLibrary?
This program takes about 2 seconds with the Boost library, but more than 3 minutes (I lost patience) with msvcrt, even in Release mode. I can't imagine why.
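For anyone who wants to reproduce the comparison, here is a minimal timing sketch. The helper name time_tiny_sleeps is hypothetical (it is not part of the program above), and the numbers it reports will depend heavily on the platform and standard library in use.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper (not from the original program): performs `iterations`
// calls to sleep_for(1ns) and returns the elapsed wall-clock time.
std::chrono::milliseconds time_tiny_sleeps(int iterations)
{
    assert(iterations >= 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        std::this_thread::sleep_for(std::chrono::nanoseconds(1));
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start);
}
```

Calling time_tiny_sleeps(1000000) and printing the result's count() gives a figure comparable to the timings quoted above; starting with a much smaller iteration count is advisable, since the full run can take minutes.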
In Visual C++ 2013, most of the C++ Standard Library concurrency functionality sits atop the Concurrency Runtime (ConcRT). ConcRT is a work-stealing runtime that provides cooperative scheduling and blocking.
Here, Concurrency::wait uses a thread pool timer to perform the wait. It uses LoadLibrary/FreeLibrary to increment the reference count of the module in which the ConcRT runtime is hosted for the duration that the timer is pending. This ensures that the module is not unloaded during the wait.
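The pin-for-the-duration pattern can be modeled roughly as follows. This is not the ConcRT source: FakeModule, ModulePin, and load_count_during_wait are hypothetical names, and a plain integer stands in for the load count that the Windows loader adjusts on LoadLibrary and FreeLibrary. It only illustrates why the release shows up in a destructor, as in the stack trace.

```cpp
#include <cassert>

// Stand-in for a loaded module. In the real runtime the count is owned by
// the Windows loader and changed through LoadLibrary/FreeLibrary.
struct FakeModule {
    int load_count = 1;  // 1 = the process's own reference
};

// Hypothetical RAII guard: holds one extra reference for its lifetime.
class ModulePin {
public:
    explicit ModulePin(FakeModule& m) : module_(m) { ++module_.load_count; }
    ~ModulePin() { --module_.load_count; }  // analogous to FreeLibrary
    ModulePin(const ModulePin&) = delete;
    ModulePin& operator=(const ModulePin&) = delete;

private:
    FakeModule& module_;
};

// Models one wait: pin the module, pretend to wait on a timer, then unpin
// automatically when `pin` goes out of scope.
int load_count_during_wait(FakeModule& m)
{
    ModulePin pin(m);
    assert(m.load_count >= 2);  // the module cannot reach zero while pinned
    return m.load_count;
}
```

In this model every wait costs one increment and one decrement. In the real runtime each of those is a loader call, which is presumably why a million tiny sleeps become so expensive.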
I'm not a ConcRT expert (not even close), so I'm not 100% sure what the exact scenario is in which the ConcRT module could be unloaded here. I do know that we made similar changes to std::thread and _beginthreadex, to acquire a reference to the module in which the thread callback is hosted, to ensure that the module is not unloaded while the thread is executing.
In Visual C++ 2015, the C++ Standard Library concurrency functionality was modified to sit directly atop Windows operating system primitives (e.g. CreateThread, Sleep, etc.) instead of ConcRT. This was done to improve performance, to resolve correctness issues when mixing use of C++ threading functionality with use of operating system functionality, and as part of a more general deemphasization of ConcRT.
Note that on Windows, sleep precision is in milliseconds and a sleep of zero milliseconds generally means "go do other useful work before coming back to me." If you compile your program with Visual C++ 2015, each call to sleep_for will in turn call Sleep(0), which "causes the thread to relinquish the remainder of its time slice to any other thread that is ready to run."
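To see why a 1 ns request can end up as a zero-millisecond sleep, here is a small sketch of the unit conversion. It assumes the duration is truncated to whole milliseconds, which is what std::chrono::duration_cast does; I have not checked exactly how the library rounds internally, and whole_milliseconds is a hypothetical helper name.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: the number of whole milliseconds left in a request
// after truncation (duration_cast rounds toward zero).
long long whole_milliseconds(std::chrono::nanoseconds request)
{
    assert(request.count() >= 0);
    return std::chrono::duration_cast<std::chrono::milliseconds>(request).count();
}
```

Under that assumption, whole_milliseconds(std::chrono::nanoseconds(1)) is 0, so anything shorter than a millisecond degenerates into a yield rather than a timed sleep.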