Keras LSTM training data format

Posted 2020-06-16 05:28

I am trying to use an LSTM neural network (using Keras) to predict an opponent's next move in the game Rock-Paper-Scissors.

I have encoded the inputs as Rock: [1 0 0], Paper: [0 1 0], Scissors: [0 0 1]. Now I want to train the neural network, but I am a bit confused about the structure my training data should have.

I have stored an opponent's game history in a .csv file with the following structure:

1,0,0
0,1,0
0,1,0
0,0,1
1,0,0
0,1,0
0,1,0
0,0,1
1,0,0
0,0,1

And I am trying to use every 5th row as my training label and the previous 4 rows as the training input. In other words, at each time step a vector of dimension 3 is fed to the network, and there are 4 time steps.

For example, the following is one training input:

1,0,0
0,1,0
0,1,0
0,0,1

And the fifth row is the training label:

1,0,0

My question is: what data format does Keras' LSTM layer accept, and what would be a good way to rearrange my data for this purpose? My incomplete code is attached below in case it helps:

#!/usr/bin/python
from __future__ import print_function

import numpy as np

from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM
from keras.optimizers import Adam

output_dim = 3
input_dim = 3
input_length = 4
batch_size = 20   #use all the data to train in one iteration


#each input has the following structure
#Rock: [1 0 0], Paper: [0 1 0], Scissor: [0 0 1]
#4 inputs (vectors) are sent to the LSTM net and output 1 vector as the prediction

#incomplete function
def read_data():
    raw_training = np.genfromtxt('training_data.csv',delimiter=',')




    print(raw_training)

def createNet(summary=False):
    print("Start Initialzing Neural Network!")
    model = Sequential()
    model.add(LSTM(4,input_dim=input_dim,input_length=input_length,
            return_sequences=True,activation='softmax'))
    model.add(Dropout(0.1))
    model.add(LSTM(4,
            return_sequences=True,activation='softmax'))
    model.add(Dropout(0.1))
    model.add(Dense(3,activation='softmax'))
    model.add(Dropout(0.1))
    model.add(Dense(3,activation='softmax'))
    model.compile(loss='categorical_crossentropy',optimizer='Adam',metrics=['accuracy'])
    if summary:
        print(model.summary())
    return model

if __name__=='__main__':
    createNet(True)

1 Answer
倾城 Initia
Answered 2020-06-16 06:04

The input to the LSTM should have the shape (sequence_length, input_dim). So in your case, NumPy arrays of shape (4, 3) should do it.

What you feed to the model will then be a NumPy array of shape (number_of_train_examples, sequence_length, input_dim). In other words, you will feed number_of_train_examples blocks of shape (4, 3). Build a list of blocks like:

1,0,0
0,1,0
0,1,0
0,0,1

and then do np.array(list_of_train_example).
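Here is a minimal sketch of how read_data() could build those arrays, assuming the non-overlapping every-5th-row split described in the question; the function signature and variable names are just illustrative:

import numpy as np

def read_data(path='training_data.csv', window=4):
    raw = np.genfromtxt(path, delimiter=',')   # one one-hot move per row, shape (num_rows, 3)
    X, y = [], []
    # take 4 consecutive moves as input and the 5th as the label, in non-overlapping blocks of 5
    for i in range(0, len(raw) - window, window + 1):
        X.append(raw[i:i + window])            # shape (4, 3)
        y.append(raw[i + window])              # shape (3,)
    return np.array(X), np.array(y)            # shapes (n, 4, 3) and (n, 3)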

However, I don't understand why you return the whole sequence from the second LSTM. It will output something of shape (4,4), and the Dense layer will probably fail on that. return_sequences means that you return the whole sequence, i.e. the hidden output at every step of the LSTM. I would set it to False for the second LSTM so that you only get a "summary" vector of shape (4,) that your Dense layer can read. In any case, even for the first LSTM it means that with an input of shape (4,3) you output something of shape (4,4), so you will have more parameters than input data for this layer... which can't really be good.

Regarding the activations, I would also use softmax, but only on the last layer; softmax is used to get probabilities as the output of a layer. It doesn't really make sense to use softmax on the LSTMs or on the Dense layer before the last one. Go for some other non-linearity like "sigmoid" or "tanh".

This is what I would do model-wise:

def createNet(summary=False):
    print("Start Initialzing Neural Network!")
    model = Sequential()
    model.add(LSTM(4,input_dim=input_dim,input_length=input_length,
            return_sequences=True,activation='tanh'))
    model.add(Dropout(0.1))
    # output shape : (4,4)
    model.add(LSTM(4,
            return_sequences=False,activation='tanh'))
    model.add(Dropout(0.1))
    # output shape : (4,)
    model.add(Dense(3,activation='tanh'))
    model.add(Dropout(0.1))
    # output shape : (3,)
    model.add(Dense(3,activation='softmax'))
    # output shape : (3,)
    model.compile(loss='categorical_crossentropy',optimizer='Adam',metrics=['accuracy'])
    if summary:
        print(model.summary())
    return model
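And, assuming the read_data() sketch above, training and prediction would then look something like this (epochs and batch_size are placeholder values, not tuned recommendations; on very old Keras 1.x the argument is nb_epoch instead of epochs):

if __name__ == '__main__':
    X, y = read_data()              # X: (n, 4, 3), y: (n, 3)
    model = createNet(True)
    model.fit(X, y, epochs=100, batch_size=2)
    # predict the opponent's next move from the most recent window of 4 moves
    probs = model.predict(X[-1:])   # shape (1, 3): probabilities for Rock / Paper / Scissors
    print(probs)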