I have my entire dataset in memory as a list of tuples, where each tuple corresponds to a batch of fixed size N, i.e.
(x[i], label[i], length[i])
- x[i]: numpy array of shape [N, W, F]; there are N examples, each with W timesteps, and every timestep has a fixed number of features F
- label[i]: class label for each example in the batch; shape [N,]
- length[i]: number of timesteps (W) for each example in the batch; shape [N,] (a toy construction of this layout is sketched below)
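
To make the layout concrete, here is a toy construction of such a list. N, F, the widths, and the random values are purely illustrative:

```python
import numpy as np

# Toy version of the in-memory layout described above.
# N and F are fixed; W changes from batch to batch.
N, F = 4, 3
widths = [5, 7, 6]   # W for each batch (illustrative)

batches = []
for W in widths:
    x = np.random.rand(N, W, F).astype(np.float32)   # [N, W, F]
    label = np.random.randint(0, 10, size=(N,))      # [N,]  class per example
    length = np.full((N,), W, dtype=np.int64)        # [N,]  timesteps per example
    batches.append((x, label, length))
```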
Main problem: W varies across batches.
I was looking at the examples and documentation for the Dataset API but could not figure out how to create a Dataset object for my case. APIs like Dataset.from_tensor_slices and Dataset.from_tensors don't seem to work (they throw broadcasting errors), since they require all tensors to have the same shape, i.e. W to be the same across batches. Is there any way to do this without having to pad my batches (using Dataset.padded_batch)?
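
For reference, this is roughly the kind of call that fails for me (variable names follow the toy sketch above); the failure comes from trying to convert the per-batch x arrays, which have different W, into a single tensor:

```python
import tensorflow as tf

# Continuing with the toy `batches` list built above.
xs, labels, lengths = zip(*batches)

# Raises a shape/broadcasting error because the x arrays have
# shapes (4, 5, 3), (4, 7, 3), (4, 6, 3) and cannot be stacked.
ds = tf.data.Dataset.from_tensor_slices(
    (list(xs), list(labels), list(lengths)))
```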