I found that in Python 3.4 there are few different libraries for multiprocessing/threading: multiprocessing vs threading vs asyncio.
But I don't know which one to use or is the "recommended one". Do they do the same thing, or are different? If so, which one is used for what? I want to write a program that uses multicores in my computer. But I don't know which library I should learn.
They are intended for (slightly) different purposes and/or requirements. CPython (a typical, mainline Python implementation) still has the global interpreter lock so a multi-threaded application (a standard way to implement parallel processing nowadays) is suboptimal. That's why multiprocessing
may be preferred over threading
. But not every problem may be effectively split into [almost independent] pieces, so there may be a need in heavy interprocess communications. That's why multiprocessing
may not be preferred over threading
in general.
asyncio
(this technique is available not only in Python, other languages and/or frameworks also have it, e.g. Boost.ASIO) is a method to effectively handle a lot of I/O operations from many simultaneous sources w/o need of parallel code execution. So it's just a solution (a good one indeed!) for a particular task, not for parallel processing in general.
[Quick Answer]
TL;DR
Making the Right Choice:
We have walked through the most popular forms of concurrency. But the question remains - when should choose which one? It really depends on the use cases. From my experience (and reading), I tend to follow this pseudo code:
if io_bound:
if io_very_slow:
print("Use Asyncio")
else:
print("Use Threads")
else:
print("Multi Processing")
- CPU Bound => Multi Processing
- I/O Bound, Fast I/O, Limited Number of Connections => Multi Threading
- I/O Bound, Slow I/O, Many connections => Asyncio
Reference
[NOTE]:
- If you have a long call method (i.e. a method that contained with a sleep time or lazy I/O), the best choice is asyncio, Twisted or Tornado approach (coroutine methods), that works with a single thread as concurrency.
- asyncio works on Python3.4 and later.
- Tornado and Twisted are ready since Python2.7
- uvloop is ultra fast
asyncio
event loop (uvloop makes asyncio
2-4x faster).
[UPDATE (2019)]:
- Japranto (GitHub) is a very fast pipelining HTTP server based on uvloop.
This is the basic idea:
Is it IO-BOUND ? ---------> USE asyncio
IS IT CPU-HEAVY ? -----> USE multiprocessing
ELSE ? ----------------------> USE threading
So basically stick to threading unless you have IO/CPU problems.