I am trying to understand threading in Python. I've looked at the documentation and examples, but quite frankly, many examples are overly sophisticated and I'm having trouble understanding them.
How do you clearly show tasks being divided for multi-threading?
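As others have mentioned, threads in CPython only help with I/O waits because of the GIL (global interpreter lock), which lets only one thread execute Python bytecode at a time. If you want to benefit from multiple cores for CPU-bound tasks, use multiprocessing.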
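A minimal sketch of the multiprocessing route, assuming a toy CPU-bound function (`sum_of_squares` and `run_in_processes` are made-up names for illustration, not part of any library):

```python
import multiprocessing


def sum_of_squares(n):
    """CPU-bound toy task: sum of i*i for i in range(n)."""
    return sum(i * i for i in range(n))


def run_in_processes(inputs, workers=2):
    # Each worker is a separate Python process with its own interpreter
    # and its own GIL, so the tasks can run on different cores.
    with multiprocessing.Pool(processes=workers) as pool:
        return pool.map(sum_of_squares, inputs)


if __name__ == "__main__":
    # The guard matters: platforms that start workers by re-importing
    # this module would otherwise re-run the pool creation.
    print(run_in_processes([10, 100, 1000]))
```

`pool.map` returns the results in the same order as the inputs, so the call reads like the built-in `map`.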
Python 3 has a facility for launching parallel tasks, the concurrent.futures module, which makes this work easier.
It offers both thread pooling and process pooling.
The following gives an insight:
ThreadPoolExecutor Example
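A small sketch of a thread pool, assuming the work is I/O-bound. The download is faked with `time.sleep`, and `fake_download` and `download_all` are hypothetical names:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(name, delay=0.1):
    """Stand-in for an I/O-bound call; sleeping releases the GIL."""
    time.sleep(delay)
    return f"{name}: done"


def download_all(names, max_workers=4):
    # map() hands items to the worker threads and yields the
    # results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fake_download, names))


if __name__ == "__main__":
    start = time.perf_counter()
    print(download_all(["a", "b", "c", "d"]))
    # With four workers the four sleeps overlap instead of adding up.
    print(f"elapsed: {time.perf_counter() - start:.2f}s")
```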
ProcessPoolExecutor
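The process pool has the same interface. A sketch assuming a toy CPU-bound task (`count_divisors` and `count_all` are illustrative names only):

```python
from concurrent.futures import ProcessPoolExecutor


def count_divisors(n):
    """CPU-bound toy task: count the divisors of n by trial division."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def count_all(numbers, max_workers=2):
    # Same calls as ThreadPoolExecutor, but each worker is a separate
    # process, so CPU-bound work can run on several cores at once.
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(count_divisors, numbers))


if __name__ == "__main__":
    print(count_all([12, 28, 360]))
```

Functions passed to a process pool have to be importable (defined at module level), because the arguments and the function reference are pickled and sent to the worker.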
Here is a very simple example of CSV import using threading. (The libraries you need to import may differ depending on your purpose.)
Helper Functions:
Driver Function:
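A minimal sketch of both pieces together, assuming one thread per CSV file and rows kept in memory. The names `read_csv`, `import_worker`, and `import_all` are hypothetical, chosen only for this illustration:

```python
import csv
import threading


def read_csv(path):
    """Helper: return all rows of one CSV file as a list of lists."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle))


def import_worker(path, results, lock):
    """Helper: read one file and store its rows under its path."""
    rows = read_csv(path)
    with lock:  # guard the shared dict while writing
        results[path] = rows


def import_all(paths):
    """Driver: start one thread per file, wait for all, return the rows."""
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=import_worker, args=(path, results, lock))
        for path in paths
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # block until every file has been read
    return results
```

Calling `import_all(["a.csv", "b.csv"])` would return a dict mapping each path to its rows. For many files a pool with a fixed number of workers is probably a better fit than one thread per file.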
Here's a simple example: you need to try a few alternative URLs and return the contents of the first one to respond.
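A sketch of that idea in Python 3, under a few assumptions of mine: the `fetch_fn` parameter (so the download can be swapped out), the timeouts, and the names `fetch` and `first_response` are all illustrative.

```python
import queue
import threading
import urllib.request


def fetch(url):
    """Download one URL and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def first_response(urls, fetch_fn=fetch, timeout=15):
    """Return (url, body) from whichever URL answers first."""
    results = queue.Queue()

    def worker(url):
        try:
            results.put((url, fetch_fn(url)))
        except Exception:
            pass  # a failed URL simply never reports back

    for url in urls:
        # Daemon threads do not keep the process alive on exit.
        threading.Thread(target=worker, args=(url,), daemon=True).start()

    # Blocks until the first worker puts a result on the queue;
    # raises queue.Empty if nothing arrives within `timeout` seconds.
    return results.get(timeout=timeout)
```

Pass it a list of alternative URLs and it returns as soon as any one of them responds; the slower threads are left running in the background.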
This is a case where threading is used as a simple optimization: each subthread waits for a URL to resolve and respond, in order to put its contents on the queue; each thread is a daemon (it won't keep the process up if the main thread ends, which is more common than not); the main thread starts all subthreads, does a `get` on the queue to wait until one of them has done a `put`, then emits the results and terminates (which takes down any subthreads that might still be running, since they're daemon threads).
Proper use of threads in Python is invariably connected to I/O operations (since CPython doesn't use multiple cores to run CPU-bound tasks anyway, the only reason for threading is not blocking the process while there's a wait for some I/O). Queues are almost invariably the best way to farm out work to threads and/or collect the work's results, by the way, and they're intrinsically thread-safe, so they save you from worrying about locks, conditions, events, semaphores, and other inter-thread coordination/communication concepts.
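To make the queue pattern concrete, here is a rough sketch that farms work out through one queue and collects results through another. The "work" is just `len(item)` as a stand-in for a real I/O call, and `worker` and `process_items` are hypothetical names:

```python
import queue
import threading


def worker(tasks, results):
    """Pull items off `tasks` until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:
            break
        results.put((item, len(item)))  # stand-in for real I/O work


def process_items(items, num_workers=3):
    """Process a list of (non-None) items with a few worker threads."""
    tasks, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker tells it to stop
    for thread in threads:
        thread.join()
    # All workers have finished, so there is one result per item.
    return dict(results.get() for _ in items)
```

No explicit lock appears anywhere: the two `Queue` objects do all the synchronization.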
I saw a lot of examples here where no real work was being performed, and they were mostly CPU-bound. Here is an example of a CPU-bound task that computes all prime numbers between 10 million and 10.05 million. I have used all four methods here.
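A rough sketch of that kind of comparison, assuming three approaches picked for illustration (a serial loop, `ThreadPoolExecutor`, and `ProcessPoolExecutor`) and trial division as the primality test. The helper names are made up, and timings will vary by machine:

```python
import math
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def is_prime(n):
    """Trial division up to the integer square root of n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


def count_primes(bounds):
    """Count primes in the half-open range [lo, hi)."""
    lo, hi = bounds
    return sum(1 for n in range(lo, hi) if is_prime(n))


def split(lo, hi, parts):
    """Cut [lo, hi) into contiguous (start, stop) chunks; assumes hi > lo."""
    step = (hi - lo + parts - 1) // parts
    return [(s, min(s + step, hi)) for s in range(lo, hi, step)]


def timed(label, fn):
    start = time.perf_counter()
    total = fn()
    print(f"{label}: {total} primes in {time.perf_counter() - start:.2f}s")


if __name__ == "__main__":
    lo, hi, workers = 10_000_000, 10_050_000, 4
    chunks = split(lo, hi, workers)
    timed("serial", lambda: count_primes((lo, hi)))
    with ThreadPoolExecutor(workers) as pool:
        timed("threads", lambda: sum(pool.map(count_primes, chunks)))
    with ProcessPoolExecutor(workers) as pool:
        timed("processes", lambda: sum(pool.map(count_primes, chunks)))
```

On CPython with the GIL, the thread version should not be expected to beat the serial one for this workload; only the process version can spread the work across cores.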
Here are the results on my 4-core Mac OS X machine:
The answer from Alex Martelli helped me; however, here is a modified version that I thought was more useful (at least to me).
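A sketch of one such variation, assuming the change is to wait for every URL and collect all responses instead of only the first. The `fetch_fn` parameter and the names `read_url` and `fetch_all` are my own illustration:

```python
import queue
import threading
import urllib.request


def read_url(url):
    """Download one URL and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def fetch_all(urls, fetch_fn=read_url):
    """Fetch every URL in its own thread; return {url: body or exception}."""
    results = queue.Queue()

    def worker(url):
        try:
            results.put((url, fetch_fn(url)))
        except Exception as exc:
            results.put((url, exc))  # report the failure instead of hiding it

    threads = [threading.Thread(target=worker, args=(url,)) for url in urls]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait for all of them, not just the fastest
    # Every worker put exactly one entry on the queue.
    return dict(results.get() for _ in threads)
```

Because each worker always reports back, either with a body or with the exception it hit, the final loop can drain exactly one entry per thread without blocking.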