-->

RNN/LSTM deep learning model?

Posted 2020-03-31 09:49

Question:

I am trying to build an RNN/LSTM model for binary classification (0 or 1).

A sample of my dataset (columns, in order: patient number, time in milliseconds, normalization of X, Y and Z, kurtosis, skewness, pitch, roll and yaw, label):

1,15,-0.248010047716,0.00378335508419,-0.0152548459993,-86.3738760481,0.872322164158,-3.51314800063,0

1,31,-0.248010047716,0.00378335508419,-0.0152548459993,-86.3738760481,0.872322164158,-3.51314800063,0

1,46,-0.267422664673,0.0051143782875,-0.0191247001961,-85.7662354031,1.0928406847,-4.08015176908,0

1,62,-0.267422664673,0.0051143782875,-0.0191247001961,-85.7662354031,1.0928406847,-4.08015176908,0 

What I have tried:

import numpy as np
from keras.datasets import imdb
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Bidirectional
from keras.preprocessing import sequence
# fix random seed for reproducibility
np.random.seed(7)

train = np.loadtxt("featwithsignalsTRAIN.txt", delimiter=",")
test = np.loadtxt("featwithsignalsTEST.txt", delimiter=",")

x_train = train[:,[2,3,4,5,6,7]]
x_test = test[:,[2,3,4,5,6,7]]
y_train = train[:,8]
y_test = test[:,8]

# create the model
model = Sequential()
model.add(LSTM(20, dropout=0.2, input_dim=6))
model.add(Dense(4, activation = 'sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, epochs = 2)

But it gives me the following error:

Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (1415684, 6)

Answer 1:

The LSTM layer takes a 3-dimensional input of shape (batch_size, timesteps, features). In your case you have only a 2-dimensional input of shape (batch_size, features), which is why Keras reports that it expected 3 dimensions but got an array of shape (1415684, 6).
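To make the shape mismatch concrete, here is a small NumPy sketch. It uses random stand-in data rather than your dataset, and it only illustrates shapes. Note that adding a time axis of length 1 would satisfy the dimension check but would give the LSTM no temporal context, so it is not a real fix.

```python
import numpy as np

# Stand-in for x_train: 8 rows, 6 features (random values, for illustration only)
x_2d = np.random.rand(8, 6)
print(x_2d.shape)  # (8, 6): (batch_size, features), which the LSTM rejects

# Inserting a time axis of length 1 yields (batch_size, timesteps, features)
x_3d = x_2d[:, np.newaxis, :]
print(x_3d.shape)  # (8, 1, 6)
```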

The LSTM layer is designed for sequence data (sentences, stock prices, ...). You need to reshape your data so that it can be used this way. More specifically, you need to reshape your data to have one row per patient (you could also choose to have multiple sequences per patient, but let's say we want one row per patient for now), and each row needs to contain multiple arrays, each array corresponding to one observation of that patient.
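The reshaping described above could be sketched roughly as follows. This is a minimal NumPy illustration, not a definitive implementation. `rows_to_sequences` is a hypothetical helper, and it makes several assumptions that are not in the question: column 0 is the patient id, column 1 is the time, columns 2 to 7 are the features, column 8 is the label, every patient gets one label (taken from the last row), and sequences are truncated or zero-padded to a fixed `timesteps` length. The demo values are made up.

```python
import numpy as np

def rows_to_sequences(data, timesteps, feature_cols=(2, 3, 4, 5, 6, 7), label_col=8):
    """Group rows by patient id (column 0) into fixed-length sequences.

    Hypothetical helper: returns x of shape (n_patients, timesteps, n_features)
    and y of shape (n_patients,).
    """
    patient_ids = np.unique(data[:, 0])
    x = np.zeros((len(patient_ids), timesteps, len(feature_cols)))
    y = np.zeros(len(patient_ids))
    for i, pid in enumerate(patient_ids):
        rows = data[data[:, 0] == pid]
        # Order each patient's observations by time (assumed to be column 1)
        rows = rows[np.argsort(rows[:, 1], kind="stable")]
        # Truncate to `timesteps`; shorter sequences stay zero-padded at the end
        seq = rows[:timesteps, list(feature_cols)]
        x[i, :len(seq)] = seq
        # Assumption: one label per patient, taken from the last observation
        y[i] = rows[-1, label_col]
    return x, y

# Tiny made-up example: 2 patients, 9 columns per row
demo = np.array([
    [1, 15, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0],
    [1, 31, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0],
    [1, 46, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0],
    [2, 15, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 1],
    [2, 31, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 1],
])
x_seq, y_seq = rows_to_sequences(demo, timesteps=3)
print(x_seq.shape)  # (2, 3, 6)
print(y_seq)        # [0. 1.]
```

An array shaped like `x_seq` matches what the LSTM expects, with `input_shape=(timesteps, 6)` on the layer. With one label per sequence, the output layer for binary classification would typically be `Dense(1, activation='sigmoid')`.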