I have some Python 3 code like this:
import numpy as np
import pycuda.driver as cuda
from pycuda.compiler import SourceModule
import tensorflow as tf
# initialize the driver, then create a device and context
cuda.init()
cudadevice = cuda.Device(gpuid1)
cudacontext = cudadevice.make_context()
config = tf.ConfigProto()
config.gpu_options.visible_device_list = '{}'.format(gpuid2)
sess = tf.Session(config=config)
# compile the CUDA source (read from a .cu file into the string cudaCode)
cuda_mod = SourceModule(cudaCode, include_dirs=[dir_path], no_extern_c=True, options=['-O0'])
# in the .cu code a texture named "map" is defined as:
# texture<float4, cudaTextureType2D, cudaReadModeElementType> map;
texRef = cuda_mod.get_texref('map')
# tex is a np.ndarray with shape (256, 256, 4); it is the output of a
# TensorFlow graph, obtained by calling sess.run()
tex = np.ascontiguousarray(tex.astype(np.float32))
tex_gpu = cuda.make_multichannel_2d_array(tex, 'C')
# error here!!!!!
texRef.set_array(tex_gpu)
and the error message:
pycuda._driver.LogicError: cuTexRefSetArray failed: peer access has not been enabled
The peer access error appears when TensorFlow is also in use (even if gpuid1 and gpuid2 are the same), but everything works fine without TensorFlow.
I found that "peer access" has something to do with communication between GPUs (devices). But all I'm doing here is copying a numpy array into GPU memory as a texture, so I don't see how it could involve transferring data between different GPUs. What's going wrong? Thanks!
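One way to see what is happening is to check which device the context currently on top of the thread's context stack belongs to. A minimal probe along these lines (not part of the original program; the helper name is my own):

# probe: report the device that owns the CUDA context currently on top
# of this thread's context stack
def current_ctx_device():
    return cuda.Context.get_device().name()

print(current_ctx_device())  # compare the output before and after sess.run()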
It seems I have found the solution. When texRef.set_array(tex_gpu) is wrapped between cudacontext.push() and cudacontext.pop() to explicitly switch the current CUDA context, everything works. Presumably sess.run() leaves TensorFlow's own CUDA context current on the thread, so the PyCUDA context has to be pushed back onto the context stack before any PyCUDA call that touches GPU memory.
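For reference, a minimal sketch of the working version, reusing cudacontext, texRef, and tex from the snippet above (the exact push/pop placement is my guess at the narrowest region that needs the PyCUDA context):

# make the PyCUDA context current again; sess.run() appears to leave
# TensorFlow's own CUDA context on top of the thread's context stack
cudacontext.push()
try:
    tex_gpu = cuda.make_multichannel_2d_array(tex, 'C')
    texRef.set_array(tex_gpu)
finally:
    cudacontext.pop()  # restore the previously current context

Popping in a finally block ensures TensorFlow's context is restored even if set_array() raises, so subsequent sess.run() calls still work.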