I am writing some code that spawns quite a few threads (about 512 at the moment, but that could get higher in the future). Each thread performs only a small number of operations, so I want the overhead that the threads place on the system to be kept to a minimum.
I am setting the stack size using pthread_attr_setstacksize(), and I can get the minimum allowed stack size from PTHREAD_STACK_MIN. But my question is: is it safe to use PTHREAD_STACK_MIN for the thread stack size? How do I go about calculating how much stack I need? Are there any hidden overheads that I will need to add to my calculation?
Also, are there any other techniques I can use to reduce the threads' burden on the system?
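Reducing the thread stack size will not reduce overhead (not in terms of CPU, memory use or performance). On most modern systems a thread's stack is only reserved address space, and its pages are backed by physical memory as they are first touched. Your only limit in this respect is the total available virtual address space given to threads on your platform.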
I would use the default stack size until a platform actually presents problems (if that happens at all), and only then minimize stack usage. Be aware that minimizing stack usage can itself lead to real performance issues, as you'll need to fall back on the heap or devise thread-specific allocation elsewhere.
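To see what the default actually is on a given platform before deciding whether to change it, you can query a freshly initialized attribute object. A small sketch follows (default_thread_stack_size is a hypothetical helper name; the value reported is implementation-defined):

```c
#include <pthread.h>
#include <stddef.h>

/* Returns the stack size reported by a default-initialized attribute object,
 * or 0 if the query fails. */
size_t default_thread_stack_size(void) {
    pthread_attr_t attr;
    size_t size = 0;

    if (pthread_attr_init(&attr) != 0)
        return 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0)
        size = 0;
    pthread_attr_destroy(&attr);
    return size;
}
```

Multiplying the result by the planned thread count gives a rough upper bound on the address space the stacks will reserve.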
Hidden overheads may include:
- Allocation of large arrays on the stack, such as by VLA, alloca() or just plain statically sized automatic arrays.
- Code you don't control, or whose consequences you weren't aware of, such as templates, factory classes etc. However, given that you did not specify C++, this is less likely to be a problem.
- Code imported from libraries, headers etc. These may change between versions and significantly alter their stack usage, or even their thread usage.
- Recursion. This can also arise from the points above; consider things like boost::bind, variadic templates, crazy macros, and then just general recursion using buffers or large objects on the stack.
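As an illustration of the first point, a large working buffer can be moved from the stack to the heap so that the frame size no longer depends on it. This is only a sketch: SCRATCH_BYTES and sum_with_heap_scratch are made-up names, and the on-stack variant it replaces would have declared `unsigned char scratch[SCRATCH_BYTES];` as an automatic array.

```c
#include <stdlib.h>
#include <string.h>

#define SCRATCH_BYTES (256 * 1024) /* hypothetical working-set size */

/* Sums n bytes via a scratch copy. The scratch buffer lives on the heap, so
 * this function's stack frame stays small regardless of SCRATCH_BYTES.
 * Returns -1 if n is too large or the allocation fails. */
long sum_with_heap_scratch(const unsigned char *data, size_t n) {
    unsigned char *scratch;
    long total = 0;

    if (n > SCRATCH_BYTES)
        return -1;
    scratch = malloc(SCRATCH_BYTES);
    if (scratch == NULL)
        return -1;
    memcpy(scratch, data, n);
    for (size_t i = 0; i < n; i++)
        total += scratch[i];
    free(scratch);
    return total;
}
```

The trade-off is the one mentioned earlier: every call now pays for a heap allocation, which may cost more than the stack space it saves.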
In addition to setting the stack size, you can adjust thread priorities, and suspend and resume threads as required, which can significantly assist the scheduler and system responsiveness. Pthreads also lets you set the contention scope; system-scope (LWP) and process-scope scheduling vary widely in their performance characteristics.
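Contention scope is set on the attribute object with pthread_attr_setscope(). Support differs by platform (Linux, for example, only supports PTHREAD_SCOPE_SYSTEM), so a cautious sketch tries the preferred scope and then reads back whatever is actually in effect. The helper name choose_scope is hypothetical.

```c
#include <pthread.h>

/* Tries to set the preferred contention scope on attr. If the platform
 * rejects it, the attribute keeps its previous scope. Returns the scope that
 * is in effect afterwards, or -1 if it cannot be read back. */
int choose_scope(pthread_attr_t *attr, int preferred) {
    int scope;

    /* A failure here (e.g. ENOTSUP) is tolerated on purpose. */
    (void)pthread_attr_setscope(attr, preferred);
    if (pthread_attr_getscope(attr, &scope) != 0)
        return -1;
    return scope;
}
```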
Here are some useful links:
- Improving Performance through Threads
- linux pthread_suspend
You shouldn't be creating anywhere near that many threads, and you definitely shouldn't be making a new thread to do a small number of operations. You should create a new thread if and only if your existing thread(s) are fully saturated AND there are more physical or logical cores available to do work. That puts a practical limit for a reasonable current application at about 10 threads or so; even on a hexacore you'd only need 12 or so at most. Such a design is very flawed, will use a huge amount of process memory, and won't really improve performance.
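One way to tie the thread count to the hardware rather than to the number of tasks is to ask the system how many cores are online. A minimal sketch, assuming _SC_NPROCESSORS_ONLN is available (it is a common extension on Linux, macOS and the BSDs, not part of POSIX); suggested_thread_count is a hypothetical name:

```c
#include <unistd.h>

/* Returns the number of online logical cores, or 1 if it cannot be determined. */
long suggested_thread_count(void) {
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}
```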
As for the stack size, you can't really compute how much you need for an arbitrary thread, as it depends entirely on the code being run. For comparison, in Visual Studio the default stack size is 1 MB. You would have to post the entire code AND the resulting disassembly executed by the thread to know how much stack to use. Just stick it at a couple of megabytes.
The required size of stack frames depends on the compiler you use. Basically, you could try to estimate it from the size of your automatic variables and parameters, plus some overhead for the return address, saved registers etc.
You should also consider whether a thread pool would be a viable alternative, since creating a thread isn't free.
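A very small version of that idea, for a fixed batch of tasks, could look like this. It is a sketch under stated assumptions, not a general-purpose pool: task_fn, struct batch, run_batch, square_task and MAX_POOL_THREADS are all hypothetical names, and the workers simply pull task indices from a shared counter until the batch is exhausted.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_POOL_THREADS 64 /* arbitrary cap chosen for this sketch */

typedef void (*task_fn)(size_t index, void *ctx);

struct batch {
    pthread_mutex_t lock;
    size_t next;  /* index of the next unclaimed task */
    size_t count; /* total number of tasks */
    task_fn fn;
    void *ctx;
};

/* Each worker claims one index at a time under the lock, then runs the task
 * outside the lock. It exits when no indices are left. */
static void *pool_worker(void *arg) {
    struct batch *b = arg;
    for (;;) {
        size_t i;
        pthread_mutex_lock(&b->lock);
        i = b->next;
        if (i < b->count)
            b->next++;
        pthread_mutex_unlock(&b->lock);
        if (i >= b->count)
            return NULL;
        b->fn(i, b->ctx);
    }
}

/* Runs fn(0..ntasks-1, ctx) on nthreads workers and waits for them.
 * Returns 0 on success, or an error code if any worker failed to start
 * (workers that did start still drain the batch before this returns). */
int run_batch(size_t nthreads, size_t ntasks, task_fn fn, void *ctx) {
    pthread_t tids[MAX_POOL_THREADS];
    struct batch b;
    size_t started = 0;
    int err;

    if (nthreads == 0 || nthreads > MAX_POOL_THREADS)
        return EINVAL;
    b.next = 0;
    b.count = ntasks;
    b.fn = fn;
    b.ctx = ctx;
    err = pthread_mutex_init(&b.lock, NULL);
    if (err != 0)
        return err;
    while (err == 0 && started < nthreads) {
        err = pthread_create(&tids[started], NULL, pool_worker, &b);
        if (err == 0)
            started++;
    }
    for (size_t i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    pthread_mutex_destroy(&b.lock);
    return err;
}

/* Example task: stores index squared into an output array passed as ctx. */
void square_task(size_t index, void *ctx) {
    unsigned long *out = ctx;
    out[index] = (unsigned long)index * index;
}
```

With this shape, 512 small jobs become run_batch(4, 512, square_task, out): four threads are created once and share the work, instead of 512 threads each doing one tiny job.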