Saving Keras augmented data as a NumPy array

Posted 2020-07-30 02:53

Question:

Using Keras ImageDataGenerator, we can save augmented images as PNG or JPG files:

    for X_batch, y_batch in datagen.flow(train_data, train_labels, batch_size=batch_size,
                                         save_to_dir='images', save_prefix='aug', save_format='png'):
        ...  # each batch drawn here is also written to the 'images' directory

I have a dataset of shape (1600, 4, 100, 100), i.e. 1600 images with 4 channels of 100x100 pixels. How can I save the augmented data as a NumPy array of shape (N, 4, 100, 100) instead of as individual image files?

Answer 1:

datagen.flow() yields batches indefinitely, so the loop needs an explicit stop. Since you know the number of samples (1600), you can break out of the loop as soon as that many samples have been generated.

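    import numpy as np

    augmented_data = []
    num_augmented = 0
    # shuffle=False keeps the augmented samples in the same order as train_data
    for X_batch, y_batch in datagen.flow(train_data, train_labels, batch_size=batch_size, shuffle=False):
        augmented_data.append(X_batch)
        num_augmented += X_batch.shape[0]  # count actual samples; the last batch may be smaller
        if num_augmented >= train_data.shape[0]:  # one full pass over the data is done
            break
    augmented_data = np.concatenate(augmented_data)  # shape (1600, 4, 100, 100)
    np.save(...)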

Note that batch_size should divide the number of samples evenly (e.g. batch_size=10 for 1600 samples), so that the running count lands exactly on the dataset size and no extra augmented images are generated.
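The same idea can be wrapped in a small helper that also keeps the labels and works when N is larger than the dataset (several augmentation passes). This is only a sketch: `collect_augmented`, `fake_flow` and the file name `augmented.npz` are made-up names, and `fake_flow` is a stand-in for `datagen.flow(...)` so the example runs without Keras installed.

```python
import os
import tempfile

import numpy as np


def collect_augmented(batch_iter, num_samples):
    """Draw (X, y) batches from an endless iterator until num_samples are collected."""
    xs, ys = [], []
    collected = 0
    for X_batch, y_batch in batch_iter:
        xs.append(X_batch)
        ys.append(y_batch)
        collected += X_batch.shape[0]  # count actual samples; a batch may be short
        if collected >= num_samples:
            break
    # Trim any overshoot so the result has exactly num_samples rows.
    return np.concatenate(xs)[:num_samples], np.concatenate(ys)[:num_samples]


def fake_flow(data, labels, batch_size):
    """Hypothetical stand-in for datagen.flow(...): loops over the data forever."""
    while True:
        for start in range(0, len(data), batch_size):
            stop = start + batch_size
            # The "augmentation" here is just a flip along the last axis.
            yield data[start:stop, ..., ::-1], labels[start:stop]


# Small stand-in for the (1600, 4, 100, 100) dataset from the question.
train_data = np.random.rand(16, 4, 10, 10).astype("float32")
train_labels = np.arange(16)

# Ask for N = 40 samples, i.e. two and a half passes over the 16 originals.
aug_x, aug_y = collect_augmented(
    fake_flow(train_data, train_labels, batch_size=5), num_samples=40
)

# Store data and labels together in one compressed .npz archive.
out_path = os.path.join(tempfile.mkdtemp(), "augmented.npz")
np.savez_compressed(out_path, x=aug_x, y=aug_y)

with np.load(out_path) as archive:
    print(archive["x"].shape, archive["y"].shape)  # (40, 4, 10, 10) (40,)
```

With the real generator, the call would presumably look like `collect_augmented(datagen.flow(train_data, train_labels, batch_size=batch_size, shuffle=False), num_samples=train_data.shape[0])`. Because the helper counts the samples actually returned and trims the result, batch_size no longer has to divide the sample count.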