I want to specify which GPU my process runs on, and I set it up as follows:

    import tensorflow as tf

    with tf.device('/gpu:0'):
        a = tf.constant(3.0)

    with tf.Session() as sess:
        while True:
            print(sess.run(a))

However, it still allocates memory on both of my GPUs:

    | 0 7479 C python 5437MiB
    | 1 7479 C python 5437MiB
I believe you need to set CUDA_VISIBLE_DEVICES=1, or whichever GPU you want to use. If you make only one GPU visible, you will refer to it in TensorFlow as /gpu:0, regardless of what you set the environment variable to.
More info on that environment variable: https://devblogs.nvidia.com/cuda-pro-tip-control-gpu-visibility-cuda_visible_devices/
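For example, a minimal sketch, assuming you want physical GPU 1 and that the variable is set before TensorFlow initializes any GPU device:

    import os
    # Make only physical GPU 1 visible to this process; this must happen
    # before TensorFlow touches the GPUs.
    os.environ["CUDA_VISIBLE_DEVICES"] = "1"

    import tensorflow as tf

    # Inside this process, the single visible GPU is always /gpu:0.
    with tf.device('/gpu:0'):
        a = tf.constant(3.0)

    with tf.Session() as sess:
        print(sess.run(a))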
There are 3 ways to achieve this (a sketch of approaches 2 and 3 follows the list):

1. Using the CUDA_VISIBLE_DEVICES environment variable: setting CUDA_VISIBLE_DEVICES=1 makes only device 1 visible, and setting CUDA_VISIBLE_DEVICES=0,1 makes devices 0 and 1 visible.

2. Using with tf.device('/gpu:2') and then creating the graph inside that block. The graph will then run on GPU device 2.

3. Using config = tf.ConfigProto(device_count={'GPU': 1}) and then sess = tf.Session(config=config). Note that device_count limits TensorFlow to a single GPU device (the first visible one); it does not select device 1 specifically.
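A minimal sketch of approaches 2 and 3 together, assuming only one GPU is visible (so it appears as /gpu:0):

    import tensorflow as tf

    # Approach 2: pin graph construction to a specific visible device.
    with tf.device('/gpu:0'):
        a = tf.constant([1.0, 2.0])
        b = tf.constant([3.0, 4.0])
        c = a + b

    # Approach 3: cap the number of GPU devices TensorFlow may use.
    config = tf.ConfigProto(device_count={'GPU': 1})

    with tf.Session(config=config) as sess:
        print(sess.run(c))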
TF would allocate all available memory on each visible GPU if not told otherwise.
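One common way to tell it otherwise is via the session config; a sketch of the standard TF 1.x memory options (use either allow_growth or the fixed fraction):

    import tensorflow as tf

    config = tf.ConfigProto()
    # Grow GPU memory on demand instead of grabbing it all up front.
    config.gpu_options.allow_growth = True
    # Or cap the fraction of each visible GPU's memory this process may use:
    # config.gpu_options.per_process_gpu_memory_fraction = 0.4

    sess = tf.Session(config=config)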
You can set CUDA_VISIBLE_DEVICES=0,1 if it is a one-time thing.
If you are using a cluster and you don't want to keep track of which GPU is busy and type in the information manually, you can call the method below before constructing a session. It will filter out GPUs that are already in use (i.e. that don't have much free memory) and set CUDA_VISIBLE_DEVICES for you.
The function:
    import subprocess as sp
    import os

    def mask_unused_gpus(leave_unmasked=1):
        ACCEPTABLE_AVAILABLE_MEMORY = 1024  # MiB
        COMMAND = "nvidia-smi --query-gpu=memory.free --format=csv"
        try:
            _output_to_list = lambda x: x.decode('ascii').split('\n')[:-1]
            # Skip the CSV header line, then parse each GPU's free-memory value.
            memory_free_info = _output_to_list(sp.check_output(COMMAND.split()))[1:]
            memory_free_values = [int(x.split()[0]) for x in memory_free_info]
            available_gpus = [i for i, x in enumerate(memory_free_values) if x > ACCEPTABLE_AVAILABLE_MEMORY]
            if len(available_gpus) < leave_unmasked:
                raise ValueError('Found only %d usable GPUs in the system' % len(available_gpus))
            os.environ["CUDA_VISIBLE_DEVICES"] = ','.join(map(str, available_gpus[:leave_unmasked]))
        except Exception as e:
            print('"nvidia-smi" is probably not installed. GPUs are not masked', e)
Limitations: if you start multiple scripts at once this can still cause a collision, because memory is not allocated immediately when you construct a session. If that is a problem for you, you can use a randomized version as in my original source code: mask_busy_gpus().