I'm using TensorFlow 0.8.0 with Python 2.7. My IDE is PyCharm and my OS is Ubuntu 14.04.
I'm noticing that the following code causes my computer to freeze and/or crash:
# you will need these files!
# https://www.kaggle.com/c/digit-recognizer/download/train.csv
# https://www.kaggle.com/c/digit-recognizer/download/test.csv
import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt
import matplotlib.cm as cm
# read in the image data from the csv file
# the format is: label pixel0 pixel1 ... pixel783 (there are 42,000 rows like this)
data = pd.read_csv('../train.csv')
labels = data.iloc[:,:1].values.ravel() # shape = (42000,) after ravel
labels_count = np.unique(labels).shape[0] # = 10
images = data.iloc[:,1:].values # shape = (42000, 784)
images = images.astype(np.float64)
image_size = images.shape[1]
image_width = image_height = np.sqrt(image_size).astype(np.int32) # since these images are square... height = width
# turn all the gray-pixel image-values into percentages of 255
# a 1.0 means a pixel is 100% black, and 0.0 would be a pixel that is 0% black (or white)
images = np.multiply(images, 1.0/255)
# create oneHot vectors from the label #s
oneHots = tf.one_hot(labels, labels_count, 1, 0) #shape = (42000, 10)
#split up the training data even more (into validation and train subsets)
VALIDATION_SIZE = 3167
validationImages = images[:VALIDATION_SIZE]
validationLabels = labels[:VALIDATION_SIZE]
trainImages = images[VALIDATION_SIZE:]
trainLabels = labels[VALIDATION_SIZE:]
# ------------- Building the NN -----------------
# set up our weights (or kernels?) and biases for each pixel
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(.1, shape=shape, dtype=tf.float32)
    return tf.Variable(initial)
# convolution
def conv2d(x, W):
    return tf.nn.conv2d(x, W, [1,1,1,1], 'SAME')
# pooling
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
# placeholder variables
# images
x = tf.placeholder('float', shape=[None, image_size])
# labels
y_ = tf.placeholder('float', shape=[None, labels_count])
# first convolutional layer
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
# turn shape(40000,784) into (40000,28,28,1)
image = tf.reshape(trainImages, [-1,image_width , image_height,1])
image = tf.cast(image, tf.float32)
# print (image.get_shape()) # =>(40000,28,28,1)
h_conv1 = tf.nn.relu(conv2d(image, W_conv1) + b_conv1)
# print (h_conv1.get_shape()) # => (40000, 28, 28, 32)
h_pool1 = max_pool_2x2(h_conv1)
# print (h_pool1.get_shape()) # => (40000, 14, 14, 32)
# second convolutional layer
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
#print (h_conv2.get_shape()) # => (40000, 14,14, 64)
h_pool2 = max_pool_2x2(h_conv2)
#print (h_pool2.get_shape()) # => (40000, 7, 7, 64)
# densely connected layer
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
# (40000, 7, 7, 64) => (40000, 3136)
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
#print (h_fc1.get_shape()) # => (40000, 1024)
# dropout
keep_prob = tf.placeholder('float')
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
print h_fc1_drop.get_shape()
#readout layer for deep neural net
W_fc2 = weight_variable([1024,labels_count])
b_fc2 = bias_variable([labels_count])
print b_fc2.get_shape()
mull= tf.matmul(h_fc1_drop, W_fc2)
print mull.get_shape()
print
mull2 = mull + b_fc2
print mull2.get_shape()
y = tf.nn.softmax(mull2)
# dropout
keep_prob = tf.placeholder('float')
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
sess = tf.Session()
sess.run(tf.initialize_all_variables())
print sess.run(mull[0,2])
The last line causes the crash:
print sess.run(mull[0,2])
This is basically one location in a very big 2D array, and something about the sess.run call is causing the crash. I'm also getting a script-issue popup about some sort of Google script (maybe it's TensorFlow?). I can't copy the link because my computer is completely frozen.
Setting the session as the default session and initializing your variables before running the session may solve your problem.
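For example, a minimal sketch of that ordering, reusing the b_fc2 variable from the question (this only shows the session setup, not a full fix):

sess = tf.Session()
with sess.as_default():
    # initialize every tf.Variable before evaluating anything
    sess.run(tf.initialize_all_variables())
    # small tensors like the biases are cheap to evaluate
    print(sess.run(b_fc2))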
I suspect the problem arises because mull[0, 2], despite its small apparent size, depends on a very large computation, including multiple convolutions, max-poolings, and a large matrix multiplication; therefore either your computer becomes fully loaded for a long period of time, or it runs out of memory. (You should be able to tell which by running top and checking what resources are used by the python process in which you are running TensorFlow.)

The amount of computation is so large because your TensorFlow graph is defined in terms of the entire training dataset, trainImages, which contains 40,000 images:

image = tf.reshape(trainImages, [-1, image_width, image_height, 1])
image = tf.cast(image, tf.float32)

Instead, it would be more efficient to define your network in terms of a tf.placeholder() to which you can feed individual training examples, or mini-batches of examples. See the documentation on feeding for more information. In particular, since you are only interested in the 0th row of mull, you only need to feed the 0th example from trainImages and perform the computation on that example to produce the necessary values. (In your current program, the results for all other examples are also computed, and then discarded by the final slice operator.)
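As a rough sketch of that refactoring (every name below comes from the question's code; the only substantive change is building the graph from the existing x placeholder and passing a feed_dict when evaluating):

# define the graph in terms of the placeholder, not the whole dataset
image = tf.reshape(x, [-1, image_width, image_height, 1])
h_conv1 = tf.nn.relu(conv2d(image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
mull = tf.matmul(h_fc1_drop, W_fc2) + b_fc2

sess = tf.Session()
sess.run(tf.initialize_all_variables())
# feed only the 0th training example; keep_prob=1.0 disables dropout
print(sess.run(mull[0, 2],
               feed_dict={x: trainImages[:1], keep_prob: 1.0}))

Because only one 784-pixel example flows through the convolutions, the matrices stay tiny and the run should complete in a fraction of a second instead of exhausting your machine.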