I am writing a server, and I branch each action of into a thread when the request is incoming. I do this because almost every request makes database query. I am using a threadpool library to cut down on construction/destruction of threads.
My question is though - what is a good cutoff point for I/O threads like these? I know it would just be a rough estimate, but are we talking hundreds? thousands?
EDIT:
Thank you all for your responses, it seems like I am just going to have to test it to find out my thread count ceiling. The question is though: how do I know I've hit that ceiling? What exactly should I measure?
As Pax rightly said, measure, don't guess. That what I did for DNSwitness and the results were suprising: the ideal number of threads was much higher than I thought, something like 15,000 threads to get the fastest results.
Of course, it depends on many things, that's why you must measure yourself.
Complete measures (in French only) in Combien de fils d'exécution ?.
As many threads as the CPU cores is what I've heard very often.
One thing you should keep in mind is that python (at least the C based version) uses what's called a global interpreter lock that can have a huge impact on performance on mult-core machines.
If you really need the most out of multithreaded python, you might want to consider using Jython or something.
In most cases you should allow the thread pool to handle this. If you post some code or give more details it might be easier to see if there is some reason the default behavior of the thread pool would not be best.
You can find more information on how this should work here: http://en.wikipedia.org/wiki/Thread_pool_pattern
One thing to consider is how many cores exist on the machine that will be executing the code. That represents a hard limit on how many threads can be proceeding at any given time. However, if, as in your case, threads are expected to be frequently waiting for a database to execute a query, you will probably want to tune your threads based on how many concurrent queries the database can process.
I think this is a bit of a dodge to your question, but why not fork them into processes? My understanding of networking (from the hazy days of yore, I don't really code networks at all) was that each incoming connection can be handled as a separate process, because then if someone does something nasty in your process, it doesn't nuke the entire program.