From what I understand, the GIL makes it impossible for threads to each run on their own core at the same time.
This is a basic question, but what, then, is the point of the threading library? It seems useless if threaded code runs at the same speed as a normal program.
In some cases an application may not utilize even one core fully and using threads (or processes) may help to do that.
Think of a typical web application. It receives requests from clients, runs some queries against the database and returns data to the client. Since an IO operation is orders of magnitude slower than a CPU operation, such an application spends most of its time waiting for IO to complete. First, it waits to read the request from the socket. Then it waits until the request to the database is written into the socket opened to the DB. Then it waits for the response from the database, and then for the response to be written to the client socket.
Waiting for IO to complete may take 90% (or more) of the time spent processing a request. While a single-threaded application is waiting on IO it is simply not using the core, and the core is free to run something else. So such an application leaves room for other threads to execute, even on a single core.
In this case, when one thread waits for IO to complete, it releases the GIL and another thread can continue execution.
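A minimal sketch of that effect, assuming time.sleep is an acceptable stand-in for a blocking socket or database call (like blocking IO in CPython, it releases the GIL while waiting). The names fake_request, run_sequential and run_threaded are made up for this illustration:

```python
import threading
import time


def fake_request(delay, results, index):
    # Hypothetical handler: time.sleep stands in for blocking IO
    # and releases the GIL while the thread waits.
    time.sleep(delay)
    results[index] = index * 2


def run_sequential(n, delay):
    # Handle n "requests" one after another in the main thread.
    results = [None] * n
    start = time.perf_counter()
    for i in range(n):
        fake_request(delay, results, i)
    return results, time.perf_counter() - start


def run_threaded(n, delay):
    # Handle n "requests" in n threads so their waits overlap.
    results = [None] * n
    threads = [
        threading.Thread(target=fake_request, args=(delay, results, i))
        for i in range(n)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    _, seq_time = run_sequential(5, 0.2)
    _, thr_time = run_threaded(5, 0.2)
    print(f"sequential: {seq_time:.2f}s, threaded: {thr_time:.2f}s")
```

The sequential version waits for each delay in turn, while the threaded version should finish in roughly the time of a single delay, because the waits overlap even though only one thread runs Python code at any moment.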
Strictly speaking, CPython supports many IO-bound threads running concurrently, but only one CPU-bound thread executing at a time.
IO-bound methods: open, file.write, file.read, socket.send, socket.recv, etc. When Python calls these IO functions, it implicitly releases the GIL and reacquires it after the IO function returns.
CPU-bound methods: arithmetic calculation, etc.
C extension methods: the method must call PyEval_SaveThread and PyEval_RestoreThread explicitly (usually through the Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS macros) to tell the interpreter that it is releasing the GIL and later taking it back.
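For contrast, here is a rough sketch of the CPU-bound case. The prime-counting function and the helper names are invented for this example; any pure-Python arithmetic loop would do. On a standard CPython build with the GIL, the threaded timing is typically no better than the sequential one, though the exact numbers depend on the machine and interpreter version:

```python
import threading
import time


def count_primes(limit):
    # Pure-Python arithmetic: the thread holds the GIL while this runs.
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed_sequential(limits):
    # Run every job one after another in the main thread.
    start = time.perf_counter()
    results = [count_primes(limit) for limit in limits]
    return results, time.perf_counter() - start


def timed_threaded(limits):
    # Run one thread per job; the threads take turns holding the GIL.
    results = [None] * len(limits)

    def worker(i, limit):
        results[i] = count_primes(limit)

    threads = [
        threading.Thread(target=worker, args=(i, limit))
        for i, limit in enumerate(limits)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    jobs = [50_000] * 4
    _, seq_time = timed_sequential(jobs)
    _, thr_time = timed_threaded(jobs)
    print(f"sequential: {seq_time:.2f}s, threaded: {thr_time:.2f}s")
```

Both versions compute the same results; only the wall-clock time is worth comparing. For work like this, the multiprocessing module is the usual way to use more than one core.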