I have built a convolutional autoencoder in TensorFlow. I followed the Reading Data instructions from the TF documentation to read in my own images, which are 233 x 233 x 3. Here is my convert_to() function, adapted from those instructions:
    def convert_to(images, name):
        """Converts a dataset to tfrecords."""
        num_examples = images.shape[0]
        rows = images.shape[1]
        cols = images.shape[2]
        depth = images.shape[3]
        filename = os.path.join(FLAGS.tmp_dir, name + '.tfrecords')
        print('Writing', filename)
        writer = tf.python_io.TFRecordWriter(filename)
        for index in range(num_examples):
            print(images[index].size)
            image_raw = images[index].tostring()
            print(len(image_raw))
            example = tf.train.Example(features=tf.train.Features(feature={
                'height': _int64_feature(rows),
                'width': _int64_feature(cols),
                'depth': _int64_feature(depth),
                'image_raw': _bytes_feature(image_raw)}))
            writer.write(example.SerializeToString())
        writer.close()
When I print the size of the image at the start of the for loop, it is 162867, but when I print the length after the .tostring() line, it is 1302936. This causes problems down the road because the model thinks my input is 8x as large as it should be. Is it better to change the 'image_raw' entry in the Example to _int64_feature(image_raw), or to change the way I convert the image to a string?
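For reference, this is the kind of check I have in mind; it is only a sketch (assuming images is a NumPy float array, and the uint8 cast is just to illustrate the byte count, not necessarily the right fix):

    import numpy as np

    # Hypothetical check on one image from the batch (not in my actual code):
    img = images[0]
    print(img.dtype, img.itemsize)        # e.g. float64 has itemsize 8
    print(img.size, len(img.tostring()))  # 162867 vs. 162867 * itemsize

    # Casting to uint8 before serializing would keep one byte per element:
    raw = img.astype(np.uint8).tostring()
    print(len(raw))                       # 162867 for a 233 x 233 x 3 image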
Alternatively, the problem could be in my read_and_decode() function, e.g. the string not being decoded properly or the Example not being parsed correctly?
    def read_and_decode(self, filename_queue):
        reader = tf.TFRecordReader()
        _, serialized_example = reader.read(filename_queue)
        features = tf.parse_single_example(
            serialized_example,
            features={
                'height': tf.FixedLenFeature([], tf.int64),
                'width': tf.FixedLenFeature([], tf.int64),
                'depth': tf.FixedLenFeature([], tf.int64),
                'image_raw': tf.FixedLenFeature([], tf.string)
            })
        # Convert from a scalar string tensor to a uint8 tensor
        image = tf.decode_raw(features['image_raw'], tf.uint8)
        # Reshape into a 233 x 233 x 3 image and apply distortions
        image = tf.reshape(image, (self.input_rows, self.input_cols, self.num_filters))
        image = data_sets.normalize(image)
        image = data_sets.apply_augmentation(image)
        return image
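For completeness, this is roughly how read_and_decode() gets consumed; it is a simplified sketch of my pipeline, and the filename, batch size, and queue capacities are placeholders rather than my real settings:

    # Simplified sketch of the calling code (parameters are placeholders):
    filename_queue = tf.train.string_input_producer(
        [os.path.join(FLAGS.tmp_dir, 'train.tfrecords')])
    image = self.read_and_decode(filename_queue)

    # Batch the decoded images to feed the autoencoder.
    images_batch = tf.train.shuffle_batch(
        [image], batch_size=32, capacity=2000, min_after_dequeue=1000)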
Thank you!