TensorFlow: device CUDA:0 not supported by XLA service

Posted 2020-08-27 10:10

Question:

I got this error when using Keras with the TensorFlow backend:

tensorflow.python.framework.errors_impl.InvalidArgumentError: device CUDA:0 not supported by XLA service while setting up XLA_GPU_JIT device number 0

Relevant code:

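import tensorflow as tf
from keras import backend as K
tfconfig = tf.ConfigProto()
# Enable XLA JIT compilation globally
tfconfig.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1
# Allocate GPU memory on demand instead of reserving it all up front
tfconfig.gpu_options.allow_growth = True
K.tensorflow_backend.set_session(tf.Session(config=tfconfig))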

TensorFlow version: 1.14.0

Answer 1:

This could be because the GPU that TF uses by default (i.e. the first one) is running out of memory. If you have multiple GPUs, direct your Python program to run on one of the others. In TF (assuming TF-2.0-rc1), set the following:

import os

# Specify which GPU(s) to use; set this before TensorFlow initializes CUDA
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # Or 2, 3, etc. other than 0

import tensorflow as tf

# On CPU/GPU placement
config = tf.compat.v1.ConfigProto(allow_soft_placement=True, log_device_placement=True)
config.gpu_options.allow_growth = True
tf.compat.v1.Session(config=config)

# Note that tf.ConfigProto is no longer available at the top level in TF 2.0; use tf.compat.v1.ConfigProto instead

If, however, your environment has only one GPU, then perhaps you have no choice but to ask your colleague to stop their program, and then treat them to a cup of coffee.
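If you would rather not hard-code the GPU index, one option is to ask nvidia-smi which GPU has the most free memory and expose only that one. This is only a rough sketch, not part of the original answer: the helper names (parse_free_memory, pick_gpu, query_nvidia_smi, select_gpu) are made up for illustration, and it assumes nvidia-smi is on the PATH and reports memory as plain integers in MiB.

```python
import os
import shutil
import subprocess


def parse_free_memory(csv_text):
    """Parse 'index, free_MiB' lines into a list of (index, free_mib) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        try:
            gpus.append((int(fields[0]), int(fields[1])))
        except (IndexError, ValueError):
            # Skip lines that do not hold two integers (e.g. "[N/A]" values).
            continue
    return gpus


def pick_gpu(gpus):
    """Return the index of the GPU with the most free memory, or None if empty."""
    if not gpus:
        return None
    return max(gpus, key=lambda gpu: gpu[1])[0]


def query_nvidia_smi():
    """Return nvidia-smi's per-GPU free memory as CSV text, or '' if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return ""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,memory.free",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout if result.returncode == 0 else ""


def select_gpu():
    """Expose only the least-loaded GPU; leave the environment alone if unknown."""
    choice = pick_gpu(parse_free_memory(query_nvidia_smi()))
    if choice is not None:
        os.environ["CUDA_VISIBLE_DEVICES"] = str(choice)
    return choice
```

Call select_gpu() before importing TensorFlow so that the variable is already in place when CUDA is initialized. Note that free memory can change between the query and the moment your program starts allocating, so this is a best-effort choice only.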



Answer 2:

Chairman Guo's code:

os.environ["CUDA_VISIBLE_DEVICES"] = "1" 

solved my problem of the Jupyter notebook kernel crashing at:

tf.keras.models.load_model("path/to/my/model")

The fatal message was:

2020-01-26 11:31:58.727326: F tensorflow/stream_executor/lib/statusor.cc:34] Attempting to fetch value instead of handling error Internal: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_UNKNOWN: unknown error

My TF version is 2.2.0-dev20200123. There are 2 GPUs on this system.
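One thing that can be confusing after applying this fix: CUDA renumbers the visible devices from 0, so with CUDA_VISIBLE_DEVICES set to "1" the program still reports "device ordinal 0", but that ordinal now refers to the second physical GPU. The small sketch below just illustrates the renumbering; visible_device_map is a hypothetical helper, not a TensorFlow or CUDA API, and it ignores edge cases such as invalid entries in the list.

```python
import os


def visible_device_map(env=None):
    """Map logical CUDA ordinals to the entries listed in CUDA_VISIBLE_DEVICES.

    Returns None when the variable is unset (all GPUs visible, no renumbering).
    Entries are kept as strings because the variable may also hold GPU UUIDs.
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    tokens = [token.strip() for token in value.split(",") if token.strip()]
    return {logical: physical for logical, physical in enumerate(tokens)}
```

For example, visible_device_map({"CUDA_VISIBLE_DEVICES": "1"}) gives {0: "1"}, meaning that what the program calls device 0 is physical GPU 1.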