It seems that each TensorFlow session I open and close consumes 1280 bytes of GPU memory, which are not released until the Python kernel is terminated.
To reproduce, save the following Python script as memory_test.py:
import tensorflow as tf
import sys

n_Iterations = int(sys.argv[1])

def open_and_close_session():
    # open a session and immediately close it without running anything
    with tf.Session() as sess:
        pass

for _ in range(n_Iterations):
    open_and_close_session()

# measure the GPU memory currently held by TensorFlow's allocator
with tf.Session() as sess:
    print("bytes used=", sess.run(tf.contrib.memory_stats.BytesInUse()))
Then run it from the command line with different numbers of iterations:

python memory_test.py 0 yields bytes used= 1280
python memory_test.py 1 yields bytes used= 2560
python memory_test.py 10 yields bytes used= 14080
python memory_test.py 100 yields bytes used= 129280
python memory_test.py 1000 yields bytes used= 1281280
The math is easy: the measuring session itself accounts for the baseline 1280 bytes, and every session opened and closed in the loop leaks another 1280 bytes. I tested this script on two different Ubuntu 17.10 workstations with tensorflow-gpu 1.6 and 1.7 and different NVIDIA GPUs.
Did I miss some explicit garbage collection step, or is this a TensorFlow bug?
Edit: Note that, unlike the case described in this question, I add nothing to the default global graph within the loop, unless the tf.Session() objects themselves 'count'. If that is the case, how can one delete them? Calling tf.reset_default_graph() or using with tf.Graph().as_default(), tf.Session() as sess: doesn't help, as sketched below.
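For reference, here is a sketch of those two attempts, assuming the same TF 1.x API as the script above (the helper names are just for illustration); in my tests neither changes the reported byte count:

import tensorflow as tf

def open_and_close_session_with_reset():
    with tf.Session() as sess:
        pass
    tf.reset_default_graph()  # drop the default graph after every session

def open_and_close_session_in_private_graph():
    # give each session its own graph instead of touching the global default one
    with tf.Graph().as_default(), tf.Session() as sess:
        pass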
Turning my comment into an answer:
I can reproduce this behavior. I guess you should create an issue on the GitHub issue tracker. TF uses its own allocator mechanism, and the documentation of the Session object clearly states that close() frees all resources associated with the session, which is apparently not the case here. However, even the 1281280 bytes could potentially be reused from the memory pool by a consecutive session.
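To see how much headroom the allocator still has, and whether the leaked bytes matter in practice, the same contrib module exposes a few more ops; this is just a sketch, assuming the tf.contrib.memory_stats module used in the question:

import tensorflow as tf

# compare the bytes currently held, the peak usage so far, and the allocator's limit
with tf.Session() as sess:
    print("in use   =", sess.run(tf.contrib.memory_stats.BytesInUse()))
    print("peak use =", sess.run(tf.contrib.memory_stats.MaxBytesInUse()))
    print("limit    =", sess.run(tf.contrib.memory_stats.BytesLimit()))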
So the answer is: it seems to be a bug (still present in the recent 1.8.0-rc0 version of TensorFlow), either in close() or in the memory_stats implementation.