Python C API threading issues

Posted 2020-07-17 16:16

Question:

I am writing a C program that uses a networking library written in Python. I embed the Python library using the Python C API. The library sends all requests asynchronously and notifies me through signals when a request is done.

At least, that is the theory.

In reality I have two threading-related problems:

  1. All calls from C into the Python library block (they should return immediately).
  2. The Python library invokes the registered callbacks asynchronously (thread.start_new_thread(callback, args)). This does not work (nothing happens). If I change the Python code to callback(args), it does work.

What am I doing wrong? Is there something I have to do to make multithreading work?

Answer 1:

I have a similar scenario.

Initial workflow

  1. The application starts in the C++ layer
  2. The C++ layer invokes a function in the Python layer from the main thread
  3. That Python function, running in the main thread, creates an event thread
  4. It starts the event thread in the Python layer and returns to the C++ layer
  5. The main loop starts in the C++ layer
  6. The event thread invokes a callback function in the C++ layer when needed
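The Python side of steps 3, 4 and 6 might look roughly like the sketch below. The names start_event_thread, callback and events are hypothetical, and a plain Python callable stands in for the callback that the C++ layer would register.

```python
import threading


def start_event_thread(callback, events):
    """Create and start the event thread, then return to the caller at once."""

    def run():
        # Step 6: the event thread calls back into the host for each event.
        for event in events:
            callback(event)

    worker = threading.Thread(target=run, daemon=True)
    worker.start()   # step 4: start the thread and return immediately
    return worker


# Stand-in for the host: collect whatever the event thread reports.
received = []
worker = start_event_thread(received.append, ["connected", "done"])
worker.join()
print(received)
```

Run as a standalone script this works, because the interpreter hands the GIL back and forth between the two threads by itself. In an embedded interpreter the event thread can only run while the host's main thread is not holding the GIL.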

From the start, the event thread did not behave as expected. Based on what I observed, I suspected the GIL, so I approached the problem from that angle. Here is my solution.

Analysis

First, from the note in the PyEval_InitThreads documentation:

When only the main thread exists, no GIL operations are needed. ... Therefore, the lock is not created initially. ...

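So if multiple threads are needed, PyEval_InitThreads() must be called in the main thread. I call PyEval_InitThreads() before Py_Initialize(). Now the GIL is initialized and the main thread holds it. (From Python 3.7 on, Py_Initialize() does this itself, and the main thread likewise holds the GIL afterwards.)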

Second, each time a Python function is invoked from the C++ layer, PyGILState_Ensure() is called first to acquire the GIL, and PyGILState_Release(state) is called afterwards to restore the previous GIL state. In the workflow above, that means PyGILState_Ensure() is called before step 2 and PyGILState_Release(state) after step 4.

But there is a problem. According to the documentation for PyGILState_Ensure and PyGILState_Release, the first saves the current GIL state and acquires the GIL, and the second restores the saved state. After PyEval_InitThreads() has been called in the main thread, however, the main thread already owns the GIL, so the GIL state in the main thread looks like this:

/* After PyEval_InitThreads(), the main thread owns the GIL. */

PyGILState_STATE state = PyGILState_Ensure();
/* The main thread already owned the GIL, so it still owns it. */

...
/* invoke Python function */
...

PyGILState_Release(state);
/* The previous state is restored, so the main thread still owns the GIL. */

As the code sample shows, the main thread always owns the GIL, so the event thread never runs. The fix is to make sure the main thread does not hold the GIL when it calls PyGILState_Ensure(). Then PyGILState_Release(state) really does release the GIL and the event thread can run. So the main thread should release the GIL immediately after it has been initialized.

Here PyEval_SaveThread() is used. From the PyEval_SaveThread documentation:
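The "Ensure/Release changes nothing when the thread already owns the GIL" behaviour can be observed from Python itself. This is a minimal sketch, assuming CPython: ctypes.pythonapi calls C API functions with the GIL held, which mirrors the main thread's situation described above. The constant PyGILState_LOCKED is defined here by hand as the first value of the C enum.

```python
import ctypes

api = ctypes.pythonapi  # calls made through this handle keep the GIL held

api.PyGILState_Ensure.restype = ctypes.c_int       # PyGILState_STATE is a C enum
api.PyGILState_Release.argtypes = [ctypes.c_int]
api.PyGILState_Release.restype = None
api.PyGILState_Check.restype = ctypes.c_int

PyGILState_LOCKED = 0  # "the GIL was already held when Ensure was called"

held_before = api.PyGILState_Check()   # nonzero: this thread holds the GIL
state = api.PyGILState_Ensure()        # nested acquire by the owning thread
api.PyGILState_Release(state)          # restores the previous state: still held
held_after = api.PyGILState_Check()

print(held_before, state == PyGILState_LOCKED, held_after)
```

Because the returned state is PyGILState_LOCKED, the matching PyGILState_Release() only undoes the nesting and leaves the GIL with the calling thread.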

Release the global interpreter lock (if it has been created and thread support is enabled) and reset the thread state to NULL, ...

With this change, embedding Python with multiple threads works.

Workflow after modification

  1. The application starts in the C++ layer
  2. PyEval_InitThreads(); to enable multithreading
  3. save = PyEval_SaveThread(); to release the GIL in the main thread
  4. state = PyGILState_Ensure(); to acquire the GIL in the main thread
  5. The C++ layer invokes a function in the Python layer from the main thread
  6. That Python function, running in the main thread, creates an event thread
  7. It starts the event thread in the Python layer and returns to the C++ layer
  8. PyGILState_Release(state); to release the GIL in the main thread
  9. The main loop starts in the C++ layer
  10. The event thread invokes a callback function in the C++ layer when needed
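The difference between the two workflows can be illustrated with a toy model. This is only a sketch: a threading.RLock stands in for the GIL (it is not the real GIL, and no C API function is called), the function name run_host is hypothetical, and the comments say which C call each line imitates.

```python
import threading


def run_host(release_after_init):
    """Return True if the event thread got to run its callback."""
    gil = threading.RLock()       # toy stand-in for the GIL
    fired = threading.Event()

    def event_thread():
        # A Python thread needs the lock before it can run any code.
        if gil.acquire(timeout=0.5):
            try:
                fired.set()       # "invoke the callback"
            finally:
                gil.release()

    gil.acquire()                 # PyEval_InitThreads(): main thread owns the lock
    if release_after_init:
        gil.release()             # PyEval_SaveThread()

    gil.acquire()                 # PyGILState_Ensure()
    worker = threading.Thread(target=event_thread)
    worker.start()                # the Python layer starts the event thread
    gil.release()                 # PyGILState_Release(state)

    worker.join()                 # the host's main loop would run here
    return fired.is_set()


print(run_host(release_after_init=False))  # initial workflow
print(run_host(release_after_init=True))   # workflow after modification
```

In the first call the main thread never lets go of the lock, so the event thread times out. In the second call the release after initialization means the later release really frees the lock, and the callback runs.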