TensorFlow: Failed to create session

I get the following error when I run my code:

tensorflow.python.framework.errors_impl.InternalError: Failed to create session.

Here is my code:

# -*- coding: utf-8 -*-
import ...
import ...

checkpoint='/home/vrview/tensorflow/example/char/data/model/'
MODEL_SAVE_PATH = "/home/vrview/tensorflow/example/char/data/model/"

def getAllImages(folder):
    assert os.path.exists(folder)
    assert os.path.isdir(folder)
    imageList = os.listdir(folder)
    imageList = [os.path.join(folder,item) for item in imageList ]
    num=len(imageList)
    return imageList,num

def get_labei():
    img_dir, num = getAllImages(r"/home/vrview/tensorflow/example/char/data/model/file/")
    for i in range(num):
        image = Image.open(img_dir[i])
        image = image.resize([56, 56])
        image = np.array(image)
        image_array = image

        with tf.Graph().as_default():
            image = tf.cast(image_array, tf.float32)
            image_1 = tf.image.per_image_standardization(image)
            image_2 = tf.reshape(image_1, [1, 56, 56, 3])

            logit = color_inference.inference(image_2)
            y = tf.nn.softmax(logit)
            x = tf.placeholder(tf.float32, shape=[56, 56, 3])

            saver = tf.train.Saver()
            with tf.Session() as sess:
                ckpt = tf.train.get_checkpoint_state(MODEL_SAVE_PATH)
                if ckpt and ckpt.model_checkpoint_path:
                    global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
                    saver.restore(sess, ckpt.model_checkpoint_path)
                    print('Loading success, global_step is %s' % global_step)
                    prediction = sess.run(y)
                    max_index = np.argmax(prediction)
                else:
                    print('No checkpoint file found')

        path='/home/vrview/tensorflow/example/char/data/move_file/'+str(max_index)
        isExists = os.path.exists(path)
        if not isExists :
            os.makedirs(path)
        shutil.copy(img_dir[i], path)  # copy() accepts a directory as the destination

def main(argv=None):
    get_labei()

if __name__ == '__main__':
    tf.app.run()

And here is my error:

Traceback (most recent call last):
  File "/home/vrview/tensorflow/example/char/data/model/color_class_2.py", line 61, in <module>
    tf.app.run()
  File "/home/vrview/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/home/vrview/tensorflow/example/char/data/model/color_class_2.py", line 58, in main
    get_labei()
  File "/home/vrview/tensorflow/example/char/data/model/color_class_2.py", line 40, in get_labei
    with tf.Session() as sess:
  File "/home/vrview/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1187, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/home/vrview/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 552, in __init__
    self._session = tf_session.TF_NewDeprecatedSession(opts, status)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/vrview/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.

Tags: python tensorflow

7 answers

够拽才男人

#2 · 2020-07-13 01:40

Maybe you are out of GPU memory? Try running with

export CUDA_VISIBLE_DEVICES=''
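
The same setting can also be applied from inside the script, which helps when you cannot change the shell environment. This is a minimal sketch, assuming it runs before TensorFlow is imported, since CUDA reads the variable when it initialises. The helper name hide_gpus is only illustrative:

```python
import os


def hide_gpus():
    """Hide every CUDA device from libraries that initialise CUDA afterwards."""
    # An empty value means "no visible devices", same as the export above.
    os.environ["CUDA_VISIBLE_DEVICES"] = ""


hide_gpus()
# Import TensorFlow only after this point, e.g.:
# import tensorflow as tf
```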

Also please provide details about what platform you are using (operating system, architecture). Also include your TensorFlow version.

Are you able to create a simple session from the Python console? Something like this:

import tensorflow as tf
hello = tf.constant('hi,tensorflow')
sess = tf.Session()
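
To gather the requested details and run the session check in one go, something like the following could be used. It is a sketch, not an official diagnostic: the function name and the report keys are made up for illustration, and it falls back to tf.compat.v1.Session on the assumption that a TensorFlow 2.x install might be present.

```python
import platform
import sys


def environment_report():
    """Collect OS, architecture and version details, then try to open a session."""
    report = {
        "os": platform.platform(),
        "architecture": platform.machine(),
        "python": sys.version.split()[0],
        "tensorflow": None,
        "session": "not attempted",
    }
    try:
        import tensorflow as tf
    except Exception as exc:  # missing or broken installation
        report["session"] = "import failed: %s" % exc
        return report
    report["tensorflow"] = tf.__version__
    # tf.Session exists in TensorFlow 1.x; 2.x keeps it under tf.compat.v1.
    session_cls = getattr(tf, "Session", None) or tf.compat.v1.Session
    try:
        session_cls().close()
        report["session"] = "ok"
    except Exception as exc:  # e.g. InternalError: Failed to create session.
        report["session"] = "%s: %s" % (type(exc).__name__, exc)
    return report


for key, value in sorted(environment_report().items()):
    print("%s: %s" % (key, value))
```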

Ridiculous、

#3 · 2020-07-13 01:43

In my case it helped to revert to TensorFlow 1.9.0, as was suggested here (Anaconda had installed version 1.10.0). The downgrade automatically installs the matching version of CUDA (9.0 instead of 9.2, if I remember correctly). Downgrading is simple in Anaconda:

conda install tensorflow=1.9.0

That worked for me. This setup works with Keras 2.2.2.

孤傲高冷的网名

#4 · 2020-07-13 01:46

In the case I just solved, the fix was updating the GPU driver to the latest version and installing the CUDA toolkit. First, I added the PPA and installed the GPU driver:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-390

After I added the PPA, it showed options for driver versions, and 390 was the latest 'stable' version listed.

Then install the CUDA toolkit:

sudo apt install nvidia-cuda-toolkit

Then reboot:

sudo reboot

This updated the driver to a newer version than the 390 installed in the first step (it was 410; this was a p2.xlarge instance on AWS).
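
Since the driver that ends up loaded can differ from the one you installed, it is worth confirming the version after the reboot. A hedged sketch that asks nvidia-smi for it follows. It assumes nvidia-smi is on the PATH, and the function name is illustrative:

```python
import subprocess


def nvidia_driver_versions():
    """Return one driver version string per GPU, or None if nvidia-smi is unusable."""
    cmd = ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"]
    try:
        out = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
    except (OSError, subprocess.CalledProcessError):
        # nvidia-smi is not installed, or it cannot talk to the driver.
        return None
    lines = out.decode("utf-8", "replace").splitlines()
    return [line.strip() for line in lines if line.strip()]


print(nvidia_driver_versions())
```

A result of None means the tool is missing or cannot reach the driver, which would by itself explain a failure to create a GPU session.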

一纸荒年 Trace。

#5 · 2020-07-13 01:51

This happened to me when I had a separate TensorFlow session running in another terminal. Closing that terminal made it work.

男人必须洒脱

#6 · 2020-07-13 01:52

After you execute

export CUDA_VISIBLE_DEVICES=''

TensorFlow can no longer see the GPU, so it may train the model on the CPU only.

You can find a better solution here. It doesn't require a restart, and you can apply it on a server.
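
The link target is not preserved here, so the following is an assumption about what such a solution might look like, not the linked answer itself. By default a TensorFlow 1.x session reserves nearly all GPU memory, and the allow_growth option of tf.ConfigProto makes it allocate memory on demand, so several processes can share one GPU. The function name is hypothetical:

```python
def create_growing_session():
    """Open a session that allocates GPU memory on demand (TensorFlow 1.x API)."""
    import tensorflow as tf

    # TensorFlow 2.x keeps the 1.x session API under tf.compat.v1.
    v1 = tf if hasattr(tf, "ConfigProto") else tf.compat.v1
    config = v1.ConfigProto()
    config.gpu_options.allow_growth = True
    # Alternatively, cap the share of GPU memory this process may take:
    # config.gpu_options.per_process_gpu_memory_fraction = 0.4
    return v1.Session(config=config)
```

In the question's code this would mean replacing tf.Session() with create_growing_session(); the GPU stays in use and no environment variable has to be changed.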

Deceive 欺骗

#7 · 2020-07-13 01:53

Are you using a GPU? If so, it may simply be out of GPU memory because a previous process was not killed.

This issue helped me identify the problem: https://github.com/tensorflow/tensorflow/issues/9549

To see your GPU status, run nvidia-smi -l 2 in a terminal; it refreshes the GPU statistics every 2 seconds.

This post shows how to kill the process that is currently taking all the memory of your GPU: https://www.quora.com/How-do-I-kill-all-the-computer-processes-shown-in-nvidia-smi
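
The same check can be scripted. The sketch below lists the compute processes that nvidia-smi reports together with their GPU memory use. It assumes nvidia-smi is on the PATH; the function names and the sample row are illustrative only:

```python
import subprocess


def parse_compute_apps(text):
    """Parse 'pid, name, memory' CSV rows into (pid, name, memory_mib) tuples."""
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) < 3 or not parts[0].isdigit():
            continue  # blank or unexpected line
        pid, name, mem = parts[0], ",".join(parts[1:-1]), parts[-1]
        # Memory can be reported as "[N/A]" on some systems; keep that as None.
        rows.append((int(pid), name, int(mem) if mem.isdigit() else None))
    return rows


def gpu_processes():
    """Return the compute processes seen by nvidia-smi, or None if unavailable."""
    cmd = ["nvidia-smi", "--query-compute-apps=pid,process_name,used_memory",
           "--format=csv,noheader,nounits"]
    try:
        out = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_compute_apps(out.decode("utf-8", "replace"))


# Hypothetical sample row, used only to illustrate the parsing.
print(parse_compute_apps("12345, python, 10937\n"))  # [(12345, 'python', 10937)]
```

A stale process found this way can then be stopped with kill followed by its PID, or with os.kill(pid, signal.SIGTERM) from Python, provided you own the process.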
