Cython parallel prange - thread locality?

Posted 2019-08-02 13:47

Question:

I am iterating using prange over a list like this:

from cython.parallel import prange, threadid

cdef int tid
cdef CythonElement tEl
cdef int a, b, c

# elList: python list of CythonElement instances is passed via function call
for n in prange(nElements, schedule='dynamic', nogil=True):
    with gil:
        tEl = elList[n]
        tid = threadid()
        a = tEl.a
        b = tEl.b
        c = tEl.c 

        print("thread {:} elnumber {:}".format(tid, tEl.elNumber))

    # nothing is done here

    with gil:
        print("thread {:} elnumber {:}".format(tid, tEl.elNumber))

    # some other computations based on a, b and c here ...

I expect an output like this:

thread 0 elnumber 1
thread 1 elnumber 2
thread 2 elnumber 3
thread 3 elnumber 4
thread 0 elnumber 1
thread 1 elnumber 2
thread 2 elnumber 3
thread 3 elnumber 4

But I get:

thread 1 elnumber 1
thread 0 elnumber 3
thread 3 elnumber 2
thread 2 elnumber 4
thread 3 elnumber 4
thread 1 elnumber 2
thread 0 elnumber 4
thread 2 elnumber 4

So it seems the variable tEl, which I expected to be thread-local, gets overwritten by other threads. What am I doing wrong? Thank you!

Answer 1:

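It looks like Cython deliberately excludes any Python object variables (including instances of Cython cdef classes) from the list of thread-local variables in a prange loop. As a result tEl is shared between all threads, and each thread can overwrite it between your two print statements.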

I suspect this is deliberate to avoid reference counting issues - they'd need to drop the reference count of all the thread-local variables at the end of the loop (it wouldn't be an insurmountable problem, but might be a big change). Therefore I think it's unlikely to be fixed, but a documentation update might be helpful.

The solution is to refactor your loop body into a function, where every variable is effectively local to the function call, so sharing is no longer an issue:

cdef f(CythonElement tEl):
    cdef int tid
    with nogil:
        tid = threadid()
        with gil:
            print("thread {:} elnumber {:}".format(tid, tEl.elNumber))

        with gil:
            print("thread {:} elnumber {:}".format(tid, tEl.elNumber))

    # I've trimmed the function a bit for the sake of being testable

# then for the loop:
for n in prange(nElements, schedule='dynamic', nogil=True):
    with gil:
        f(elList[n])  # pass the element in; f takes a CythonElement argument
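To illustrate why moving the body into a function helps, here is a minimal sketch in plain Python rather than Cython. It is only an analogy: the Element class, the shared dict and the barrier are assumptions made for the demonstration, and the barrier is there purely to force the "all threads write, then all threads read" interleaving that happens by chance in the prange loop.

```python
import threading

N_THREADS = 4


class Element:
    """Hypothetical stand-in for CythonElement."""

    def __init__(self, el_number):
        self.el_number = el_number


elements = [Element(i + 1) for i in range(N_THREADS)]
barrier = threading.Barrier(N_THREADS)

shared = {"tEl": None}            # one slot seen by every thread, like the shared tEl
shared_seen = [None] * N_THREADS  # what each thread reads back from the shared slot
local_seen = [None] * N_THREADS   # what each thread reads back from its own local


def body_shared(tid):
    shared["tEl"] = elements[tid]  # every thread writes the same slot
    barrier.wait()                 # all writes finish before any thread reads
    shared_seen[tid] = shared["tEl"].el_number


def body_local(tid):
    t_el = elements[tid]           # function-local: one variable per call
    barrier.wait()
    local_seen[tid] = t_el.el_number


def run(body):
    threads = [threading.Thread(target=body, args=(i,)) for i in range(N_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


run(body_shared)
run(body_local)
print("shared:", shared_seen)  # every entry is the same element number
print("local: ", local_seen)   # each thread kept its own element
```

In the shared version every thread ends up reading whichever element was written last, which matches the repeated "elnumber 4" lines in the question. In the function-local version each call keeps its own reference, which is the effect the refactoring into f relies on.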


Answer 2:

Cython provides parallelism based on threads. The order in which threads execute is not guaranteed, which is why the thread IDs do not appear in order and do not match the element numbers.
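As a rough illustration of that point, the following plain-Python sketch (an analogy, not Cython or OpenMP) models schedule='dynamic' with a work queue: each thread grabs the next free element index, so which thread handles which element can differ from run to run, while every element is still handled exactly once. The names work, handled and worker are made up for this example.

```python
import queue
import threading

N_ELEMENTS = 8
N_THREADS = 4

# Work queue holding the element indices, analogous to the prange iteration space.
work = queue.Queue()
for n in range(N_ELEMENTS):
    work.put(n)

handled = []             # (thread id, element index) pairs, in completion order
lock = threading.Lock()  # protects the shared list


def worker(tid):
    # Keep taking the next free index until the queue is empty.
    while True:
        try:
            n = work.get_nowait()
        except queue.Empty:
            return
        with lock:
            handled.append((tid, n))


threads = [threading.Thread(target=worker, args=(t,)) for t in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

for tid, n in handled:
    print("thread {} element {}".format(tid, n))
```

The pairing of thread and element in the printed lines is not fixed, so output such as "thread 1 elnumber 1" in the question is expected on its own. Only the changed element number between the first and second print points to a shared variable.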

If you want tEl to be private to the thread, you should not define it globally. Try moving cdef CythonElement tEl within the prange. See http://cython-devel.python.narkive.com/atEB3yrQ/openmp-thread-private-variable-not-recognized-bug-report-discussion (the part on private variables).