My raw_data is the PTB dataset, and I am generating batches with the following code.
def generate_batches(raw_data, batch_size, unrollings):
    global data_index
    data_len = len(raw_data)
    num_batches = data_len // batch_size
    inputs = []
    labels = []
    print(num_batches, data_len, batch_size)
    for j in xrange(unrollings):
        inputs.append([])
        labels.append([])
        for i in xrange(batch_size):
            inputs[j].append(raw_data[i + data_index])
            labels[j].append(raw_data[i + data_index + 1])
        data_index = (data_index + batch_size) % len(raw_data)
    return inputs, labels
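For reference, a minimal sketch of how this is called (toy_data is a hypothetical stand-in for the PTB word ids, and data_index must be initialized at module level before the first call, as the global statement requires):

data_index = 0                      # the function reads and updates this global
toy_data = list(range(100))         # hypothetical stand-in for the PTB word-id stream
inputs, labels = generate_batches(toy_data, batch_size=4, unrollings=5)
# inputs: 5 lists (one per unrolling) of 4 word ids each
# labels: the same ids shifted one position to the right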
In the session run loop, the generated batches are fed through feed_dict as in the following code.
for step in xrange(num_steps):
    batch_inputs, batch_labels = generate_batches(train_dataset, batch_size, unrollings=5)
    feed_dict = dict()
    for i in range(unrollings):
        feed_dict = {train_inputs : batch_inputs, train_labels : batch_labels}
    _, l, predictions, lr = session.run([optimizer, loss, train_prediction, learning_rate], feed_dict=feed_dict)
The training inputs and labels are defined as follows:
for _ in range(unrollings):
    train_data.append(tf.placeholder(shape=[batch_size], dtype=tf.int32))
    train_label.append(tf.placeholder(shape=[batch_size, 1], dtype=tf.float32))
train_inputs = train_data[:unrollings]
train_labels = train_label[:unrollings]
First, I got the error TypeError: unhashable type: 'list', so I converted each list in batch_inputs to a tuple with tuple(batch_input[i]), as explained in Python dictionary : TypeError: unhashable type: 'list'. With that resolved, I now get this error: TypeError: unhashable type: 'numpy.ndarray'.
I think you are misunderstanding how feed_dict works. But first of all, a Python dict does not accept instances of any unhashable class as keys. Neither a list nor a numpy.ndarray can be used as a dict key (wrapping one in a tuple does not help, because a tuple is only hashable if all of its elements are). I found this post explains dict keys well.

How feed_dict works
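A quick illustration of the key restriction (throwaway values, wrapped in try/except so the whole snippet runs and prints each error):

import numpy as np

for key in ([1, 2], np.array([1, 2]), ([1, 2], [3, 4])):
    try:
        {key: 'value'}            # attempt to use the object as a dict key
    except TypeError as e:
        print(e)                  # unhashable type: 'list' / 'numpy.ndarray' / 'list'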
In your graph, there should be placeholders created as symbolic tensors. Assume that your raw data is 2D with shape (num_samples, num_features), where the first dimension is the number of samples and the second is the number of features, and assume the labels are one-hot encoded with num_classes classes in total.
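Under those assumptions, the placeholders would look something like this (num_features and num_classes are hypothetical sizes chosen for illustration, not names from the question):

import tensorflow as tf

num_features, num_classes = 128, 10   # hypothetical sizes for illustration
# Symbolic tensors; `None` leaves the batch dimension flexible.
data_ph  = tf.placeholder(tf.float32, shape=[None, num_features])
label_ph = tf.placeholder(tf.float32, shape=[None, num_classes])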
Then, in your session, when setting up feed_dict you use those symbolic placeholder tensors as keys and the sampled batch data as values.
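Applied to the code in the question, that means feeding each placeholder in the list individually; a minimal sketch, reusing the variable names from the question:

import numpy as np

# Each placeholder Tensor is hashable, so it works as a feed_dict key;
# the surrounding Python lists (train_inputs, train_labels) do not.
feed_dict = {}
for i in range(unrollings):
    feed_dict[train_inputs[i]] = batch_inputs[i]                      # shape [batch_size]
    feed_dict[train_labels[i]] = np.reshape(batch_labels[i], (batch_size, 1))
_, l, predictions, lr = session.run(
    [optimizer, loss, train_prediction, learning_rate], feed_dict=feed_dict)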