I have been given some data in the following format, along with the details listed after it:
person1, day1, feature1, feature2, ..., featureN, label
person1, day2, feature1, feature2, ..., featureN, label
...
person1, dayN, feature1, feature2, ..., featureN, label
person2, day1, feature1, feature2, ..., featureN, label
person2, day2, feature1, feature2, ..., featureN, label
...
person2, dayN, feature1, feature2, ..., featureN, label
...
- there is always the same number of features, but any feature may be 0, meaning nothing was recorded
- the number of days available varies per person, e.g. person1 has 20 days of data, person2 has 50
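As a minimal sketch of the layout described above, the flat rows can be grouped into one variable-length sequence per person. The sample rows, the column order (person, day, features..., label) and the helper name `group_by_person` are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical sample rows: (person, day, feature1, feature2, label).
rows = [
    ("person1", 1, 0.5, 0.0, 1),
    ("person1", 2, 0.1, 0.3, 0),
    ("person2", 1, 0.0, 0.0, 1),
    ("person1", 3, 0.7, 0.2, 1),
    ("person2", 2, 0.4, 0.9, 0),
]

def group_by_person(rows):
    """Return {person: [(features, label), ...]} with entries ordered by day."""
    grouped = defaultdict(list)
    for person, day, *features, label in rows:
        grouped[person].append((day, features, label))
    return {
        person: [
            (feats, label)
            for _, feats, label in sorted(entries, key=lambda e: e[0])
        ]
        for person, entries in grouped.items()
    }

sequences = group_by_person(rows)
for person, seq in sequences.items():
    print(person, len(seq), "days")
```

Each person then has their own ordered sequence, whose length is the number of days available for that person.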
The goal is to predict each person's label for the following day, i.e. the label for dayN+1, either with one model per person or with one overall model (per person makes more sense to me). I can freely reformat the data (it is not large). Based on the above, and after some reading, I thought a dynamic RNN (LSTM) could work best:
- recurrent neural network: because the next day relies on the previous day
- LSTM: because the model builds up with each day
- dynamic: because not all features are present each day
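One possible way to phrase the next-day goal as supervised samples is sketched below. It assumes that each sample is one person's feature history up to some day and that the target is the label of the day after; the function name `next_day_samples` and the toy values are hypothetical. Whether earlier labels should also be fed in as inputs is left open here.

```python
def next_day_samples(features, labels):
    """Pair each history features[:t+1] with the label of day t+1."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    samples = []
    for t in range(len(features) - 1):
        history = features[: t + 1]   # days 0..t, inclusive
        target = labels[t + 1]        # label of the following day
        samples.append((history, target))
    return samples

# Toy data for a single person: 4 days, 2 features per day.
person_features = [[0.5, 0.0], [0.1, 0.3], [0.7, 0.2], [0.0, 0.0]]
person_labels = [1, 0, 1, 1]

samples = next_day_samples(person_features, person_labels)
for history, target in samples:
    print(len(history), "day(s) of history -> next-day label", target)
```

A person with N days yields N-1 samples this way, and the histories have different lengths, which is the variable-length situation that a dynamic RNN is usually meant to handle.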
If it does not make sense for the data I have, please stop me here. The question is then:
How should I format this data to feed it to TensorFlow/tflearn?
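For context, recurrent layers are commonly fed a batch shaped [samples, timesteps, features], so sequences of different lengths are usually zero-padded to a common length and the true lengths are kept alongside. The sketch below does this in plain Python. The name `pad_sequences_3d` is hypothetical and is not a library function, and the nested list it returns would still need converting to an array before being passed to a framework.

```python
def pad_sequences_3d(sequences, max_len=None):
    """Zero-pad variable-length sequences of feature vectors at the end.

    Returns (padded, lengths), where padded is a nested list of shape
    [num_sequences][max_len][num_features].
    """
    if not sequences or any(len(seq) == 0 for seq in sequences):
        raise ValueError("need at least one sequence, and none may be empty")
    n_features = len(sequences[0][0])
    if max_len is None:
        max_len = max(len(seq) for seq in sequences)
    padded, lengths = [], []
    for seq in sequences:
        seq = seq[-max_len:]  # keep only the most recent days if too long
        lengths.append(len(seq))
        padding = [[0.0] * n_features for _ in range(max_len - len(seq))]
        padded.append([list(step) for step in seq] + padding)
    return padded, lengths

batch = [
    [[0.5, 0.0], [0.1, 0.3], [0.7, 0.2]],  # 3 days of history
    [[0.0, 0.0]],                          # 1 day of history
]
padded, lengths = pad_sequences_3d(batch)
print(lengths)
```

Keeping `lengths` matters for this data because a feature value of 0 already means "nothing", so a padded day cannot be told apart from a real all-zero day by its values alone.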
I have looked at this example using tflearn, but I do not understand its input format well enough to mirror it for my data. Similarly, I found this post on a very similar question, yet the samples the poster has do not seem to be related to each other the way mine are. My experience with TensorFlow is limited to its Get Started page.