
Question:

I work in an environment in which computational resources are shared, i.e., we have a few server machines equipped with a few Nvidia Titan X GPUs each.

For small to moderate size models, the 12GB of the Titan X are usually enough for 2-3 people to run training concurrently on the same GPU. If the models are small enough that a single model does not take full advantage of all the computational units of the Titan X, this can actually result in a speedup compared with running one training process after the other. Even in cases where the concurrent access to the GPU does slow down the individual training time, it is still nice to have the flexibility of having several users running things on the GPUs at once.

The problem with TensorFlow is that, by default, it allocates the full amount of available memory on the GPU when it is launched. Even for a small 2-layer Neural Network, I see that the 12 GB of the Titan X are used up.

Is there a way to make TensorFlow only allocate, say, 4GB of GPU memory, if one knows that that amount is enough for a given model?

Answer 1:

You can set the fraction of GPU memory to be allocated when you construct a tf.Session by passing a tf.GPUOptions as part of the optional config argument:

import tensorflow as tf  # TensorFlow 1.x API

# Assume that you have 12GB of GPU memory and want to allocate ~4GB:
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)

sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

The per_process_gpu_memory_fraction acts as a hard upper bound on the amount of GPU memory that will be used by the process on each GPU on the same machine. Currently, this fraction is applied uniformly to all of the GPUs on the same machine; there is no way to set this on a per-GPU basis.
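The value 0.333 above is simply the desired amount divided by the card's total memory. A minimal sketch of that arithmetic follows; the helper name gpu_memory_fraction is hypothetical and not part of TensorFlow, and the totals are assumed to be known in advance rather than queried from the device.

```python
def gpu_memory_fraction(wanted_gb, total_gb):
    """Return the share of total GPU memory that covers wanted_gb.

    Hypothetical helper for illustration; not a TensorFlow API.
    """
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    if not 0 < wanted_gb <= total_gb:
        raise ValueError("wanted_gb must be in the range (0, total_gb]")
    return wanted_gb / total_gb


# ~4GB out of a 12GB Titan X
fraction = gpu_memory_fraction(4, 12)
print(round(fraction, 3))  # 0.333
```

The result could then be passed as per_process_gpu_memory_fraction. Because the same fraction is applied to every GPU on the machine, on a server with mixed cards it corresponds to a different absolute amount on each card.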

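Answer 2:

import tensorflow as tf  # TensorFlow 1.x API

# Start with a small allocation and grow GPU memory usage on demand
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)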

https://github.com/tensorflow/tensorflow/issues/1578
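If the code that creates the session cannot be edited easily, the same behaviour can possibly be requested through the environment. The sketch below assumes a TensorFlow release that honours the TF_FORCE_GPU_ALLOW_GROWTH variable (it is described in the TensorFlow GPU guide; older 1.x builds may ignore it). The function name is hypothetical.

```python
import os


def request_gpu_memory_growth():
    """Ask TensorFlow to grow GPU memory on demand via the environment.

    Assumption: the installed TensorFlow reads TF_FORCE_GPU_ALLOW_GROWTH.
    Call this before TensorFlow initializes the GPU, ideally before
    `import tensorflow`.
    """
    os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"


request_gpu_memory_growth()
# import tensorflow as tf  # import only after the variable is set
```

The variable can equally be exported in the shell that launches the training script, which keeps the setting out of the code on a shared machine.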

Answer 3:

Here is an excerpt from the book Deep Learning with TensorFlow:

In some cases it is desirable for the process to only allocate a subset of the available memory, or to only grow the memory usage as it is needed by the process. TensorFlow provides two configuration options on the session to control this. The first is the allow_growth option, which attempts to allocate only as much GPU memory as runtime allocations require: it starts out allocating very little memory, and as sessions get run and more GPU memory is needed, the GPU memory region used by the TensorFlow process is extended.

1) Allow growth: (more flexible)

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config)  # plus any other Session arguments

The second method is the per_process_gpu_memory_fraction option, which determines the fraction of the overall amount of memory that each visible GPU should be allocated. Note: TensorFlow does not release memory once it has been allocated, since releasing it can lead to even worse memory fragmentation.

2) Allocate fixed memory:

To allocate only 40% of the total memory of each GPU:

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4
session = tf.Session(config=config)  # plus any other Session arguments

Note: That's only useful if you truly want to bound the amount of GPU memory available to the TensorFlow process.

Answer 4:

All the answers above assume execution with a sess.run() call, which is becoming the exception rather than the rule in recent versions of TensorFlow.

When using the tf.estimator framework (TensorFlow 1.4 and above), the way to pass the fraction along to the implicitly created MonitoredTrainingSession is:

opts = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
conf = tf.ConfigProto(gpu_options=opts)
trainingConfig = tf.estimator.RunConfig(session_config=conf)  # plus any other RunConfig arguments
tf.estimator.Estimator(model_fn=...,
                       config=trainingConfig)

Similarly, in Eager mode (TensorFlow 1.5 and above):

import tensorflow.contrib.eager as tfe  # TensorFlow 1.x only

opts = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
conf = tf.ConfigProto(gpu_options=opts)
tfe.enable_eager_execution(config=conf)

Edit (11-04-2018): As an example, if you use tf.contrib.gan.train, you can use something similar to the following:

tf.contrib.gan.gan_train(..., config=conf)

Answer 5:

Shameless plug: If you install the GPU-enabled build of TensorFlow, the session will first allocate all GPU memory, whether you set it to use only the CPU or the GPU. My tip: even if you set the graph to use the CPU only, you should apply the same configuration (as answered above) to prevent unwanted GPU occupation.

In an interactive interface such as IPython you should also set that configuration; otherwise it will allocate all the memory and leave almost none for others. This is sometimes hard to notice.
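For CPU-only runs, a commonly used alternative to the session configuration is to hide the GPUs from the process altogether. This is a different technique from the one described in this answer: it relies on the CUDA runtime's CUDA_VISIBLE_DEVICES environment variable rather than on any TensorFlow option. A minimal sketch, with hypothetical function names, assuming the variable is set before TensorFlow is imported:

```python
import os


def hide_all_gpus():
    """Make no CUDA device visible to this process (CPU-only run)."""
    os.environ["CUDA_VISIBLE_DEVICES"] = ""


def expose_gpus(indices):
    """Make only the listed GPU indices visible, e.g. [0] or [0, 2]."""
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in indices)


# Set this before `import tensorflow`, so the GPUs are already hidden
# when the CUDA runtime is initialized.
hide_all_gpus()
```

In an IPython session this would go in the very first cell. On a shared server, calling expose_gpus([0]) instead would restrict the process to a single card and leave the others untouched.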

Answer 6:

I tried to train a U-Net on the VOC dataset, but because of the huge image size the GPU ran out of memory. I tried all the tips above, even with a batch size of 1, yet saw no improvement. Sometimes the TensorFlow version also causes memory issues. Try using:

pip install tensorflow-gpu==1.8.0
