Tensorflow not running on GPU

2019-03-11 06:43发布

问题:

I have aldready spent a considerable of time digging around on stack overflow and else looking for the answer, but couldn't find anything

Hi all,

I am running Tensorflow with Keras on top. I am 90% sure I installed Tensorflow GPU, is there any way to check which install I did?

I was trying to do run some CNN models from Jupyter notebook and I noticed that Keras was running the model on the CPU (checked task manager, CPU was at 100%).

I tried running this code from the tensorflow website:

# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))

And this is what I got:

MatMul: (MatMul): /job:localhost/replica:0/task:0/cpu:0
2017-06-29 17:09:38.783183: I c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\common_runtime\simple_placer.cc:847] MatMul: (MatMul)/job:localhost/replica:0/task:0/cpu:0
b: (Const): /job:localhost/replica:0/task:0/cpu:0
2017-06-29 17:09:38.784779: I c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\common_runtime\simple_placer.cc:847] b: (Const)/job:localhost/replica:0/task:0/cpu:0
a: (Const): /job:localhost/replica:0/task:0/cpu:0
2017-06-29 17:09:38.786128: I c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\common_runtime\simple_placer.cc:847] a: (Const)/job:localhost/replica:0/task:0/cpu:0
[[ 22.  28.]
 [ 49.  64.]]

Which to me shows I am running on my CPU, for some reason.

I have a GTX1050 (driver version 382.53), I installed CUDA, and Cudnn, and tensorflow installed without any problems. I installed Visual Studio 2015 as well since it was listed as a compatible version.

I remember CUDA mentioning something about an incompatible driver being installed, but if I recall correctly CUDA should have installed its own driver.

Edit: I ran theses commands to list the available devices

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

and this is what I get

[name: "/cpu:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 14922788031522107450
]

and a whole lot of warnings like this

2017-06-29 17:32:45.401429: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.

Edit 2

Tried running

pip3 install --upgrade tensorflow-gpu

and I get

Requirement already up-to-date: tensorflow-gpu in c:\users\goofynose\appdata\local\programs\python\python35\lib\site-packages
Requirement already up-to-date: markdown==2.2.0 in c:\users\goofynose\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: html5lib==0.9999999 in c:\users\goofynose\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: werkzeug>=0.11.10 in c:\users\goofynose\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: wheel>=0.26 in c:\users\goofynose\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: bleach==1.5.0 in c:\users\goofynose\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: six>=1.10.0 in c:\users\goofynose\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: protobuf>=3.2.0 in c:\users\goofynose\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: backports.weakref==1.0rc1 in c:\users\goofynose\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: numpy>=1.11.0 in c:\users\goofynose\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: setuptools in c:\users\goofynose\appdata\local\programs\python\python35\lib\site-packages (from protobuf>=3.2.0->tensorflow-gpu)

Solved: Check comments for solution. Thanks to all who helped!

I am new to this, so any help is greatly appreciated! Thank you.

回答1:

To check which devices are available to TensorFlow you can use this and see if the GPU cards are available:

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

Edit Also, you should see this kind of logs if you use TensorFlow Cuda version :

I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so.*.* locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so.*.*  locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so.*.*  locally


回答2:

It may sound dumb, but try reboot. It helped me and some other folks in GitHub.



回答3:

I was still having trouble getting GPU support even after correctly installing tensorflow-gpu via pip. My problem was that I had installed tensorflow 1.5, and CUDA 9.1 (the default version Nvidia directs you to), whereas the precompiled tensorflow 1.5 works with CUDA versions <= 9.0. Here is download page on nvidia's site to get the correct CUDA 9.0:

https://developer.nvidia.com/cuda-90-download-archive

Also make sure to update your cuDNN to a version compatible with CUDA 9.0 https://developer.nvidia.com/cudnn https://developer.nvidia.com/rdp/cudnn-download