CUDA ERROR: initialization error when using parall

Posted 2019-08-14 18:58

I use CUDA in my code, but it still runs slowly. So I changed it to run in parallel using multiprocessing (pool.map) in Python. But now I get CUDA ERROR: initialization error.

This is the function:

def step_M(self, iter_training):
    # iter_training is one (key, value) pair from the dict:
    # key = (g, p, em_iters), value = list of tuples starting with (start, end).
    gpe, e_tuple_list = iter_training
    g = gpe[0]
    p = gpe[1]
    em_iters = gpe[2]

    # Process the ranges in order of their start index.
    e_tuple_list = sorted(e_tuple_list, key=lambda tup: tup[0])
    # Collect the row indices and the matching rows of self.X for every range.
    cluster_indices = np.concatenate(
        [np.arange(d[0], d[1], dtype=np.int32) for d in e_tuple_list])
    data = np.concatenate([self.X[d[0]:d[1]] for d in e_tuple_list])

    g.train_on_subset(self.X, cluster_indices, max_em_iters=em_iters)
    return g, cluster_indices, data

And here is the calling code:

pool = Pool()
iter_bic_list = pool.map(self.step_M, iter_training.items())

iter_training looks like this: [image]

And this is the error: [image]. Could you help me fix it? Thank you.

3 Answers
We Are One
#2 · 2019-08-14 19:43

Try

sudo ldconfig /usr/local/cuda/lib64
等我变得足够好
#3 · 2019-08-14 19:44

I found that this is a problem with CUDA tying its state to a process ID. When you use the multiprocessing module, a subprocess with a separate PID is created, and it cannot access the GPU state that the parent process initialized.

A quick solution that worked for me is to use the threading module instead of the multiprocessing module.

So basically, the same process (PID) that loads the network onto the GPU should be the one that uses it.
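
A minimal sketch of the threading approach, assuming the work can be expressed as a function mapped over iter_training.items(). multiprocessing.pool.ThreadPool offers the same map interface as Pool but runs its workers as threads inside the calling process, so every call sees the same PID. The step function and the sample iter_training dictionary below are hypothetical stand-ins for self.step_M and the real data; no GPU work is done here.

```python
import os
from multiprocessing.pool import ThreadPool


def step(item):
    """Hypothetical stand-in for self.step_M; does no GPU work."""
    key, ranges = item
    # Count how many indices the (start, end) ranges cover.
    n_indices = sum(end - start for start, end in ranges)
    # Record the PID so we can confirm the worker runs in the parent process.
    return key, n_indices, os.getpid()


# Hypothetical data shaped like the question's dict:
# (g, p, em_iters) -> list of (start, end) index ranges.
iter_training = {
    ("g0", 1, 5): [(0, 3), (10, 12)],
    ("g1", 2, 5): [(3, 10)],
}

parent_pid = os.getpid()

# ThreadPool mirrors Pool's API, so pool.map can be swapped in directly.
with ThreadPool(processes=2) as pool:
    results = pool.map(step, iter_training.items())

# Every worker thread reports the parent's PID.
same_pid = all(pid == parent_pid for _, _, pid in results)
```

Threads share Python's GIL, so this is likely to help only when most of the time is spent inside GPU or native calls that release it.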

Explosion°爆炸
#4 · 2019-08-14 19:46

I realize this is a bit old but I ran into the same problem, while running under celery in my case:

syncedmem.cpp:63] Check failed: error == cudaSuccess (3 vs. 0)  initialization error

Switching from prefork to an eventlet-based pool resolved the issue. Your code could be updated along these lines:

from eventlet import GreenPool
pool = GreenPool()
iter_bic_list = list(pool.imap(self.step_M, iter_training.items()))