I want to use image augmentation in Keras. My current code looks like this:
from keras.preprocessing.image import ImageDataGenerator

# define image augmentations
train_datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    zca_whitening=True)

# generate image batches from directory
train_generator = train_datagen.flow_from_directory(train_dir)
When I run a model with this, I get the following error:
"ImageDataGenerator specifies `featurewise_std_normalization`, but it hasn't been fit on any training data."
But I didn't find clear information about how to use train_datagen.fit() together with flow_from_directory.
Thank you for your help. Mario
You are right, the docs are not very enlightening on this ...
What you need is actually a 4-step process:
1. Instantiate ImageDataGenerator with the augmentations you want
2. Fit its statistics on your training data with train_datagen.fit(x_train)
3. Set up your batch generator using flow_from_directory()
4. Train your model with fit_generator()
Here is the necessary code for a hypothetical image classification case:
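Something along these lines (a rough sketch; model, x_train, train_data_dir, nb_train_samples, batch_size and epochs are placeholders you would define yourself):

from keras.preprocessing.image import ImageDataGenerator

# Step 1: instantiate the generator with the desired transformations
train_datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    zca_whitening=True)

# Step 2: fit the normalization/whitening statistics on in-memory training data
# (x_train is a NumPy array of images)
train_datagen.fit(x_train)

# Step 3: stream batches of augmented images from the directory
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(150, 150),
    batch_size=batch_size,
    class_mode='binary')

# Step 4: train the (already compiled) model on the generated batches
model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs)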
Clearly, there are several parameters to be defined (train_data_dir, nb_train_samples etc.), but hopefully you get the idea. If you need to also use a validation_generator, it should be defined the same way as your train_generator.
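For instance, a hypothetical validation setup could look like this (validation_data_dir and nb_validation_samples are again placeholders):

# validation data: streamed from its own directory with the same datagen,
# so the statistics fitted above are applied consistently
validation_generator = train_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(150, 150),
    batch_size=batch_size,
    class_mode='binary')

# pass it to fit_generator alongside the training generator
model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)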
UPDATE (after comment)
Step 2 needs some discussion; here, x_train are the actual data which, ideally, should fit into the main memory. Also (see the documentation), this step is required only when featurewise_center, featurewise_std_normalization, or zca_whitening is set to True.

However, there are many real-world cases where the requirement that all the training data fit into memory is clearly unrealistic. How you center/normalize/whiten data in such cases is a (huge) sub-field in itself, and arguably the main reason for the existence of big data processing frameworks such as Spark.
So, what to do in practice here? Well, the next logical action in such a case is to sample your data; indeed, this is exactly what the community advises - see, for instance, Keras creator Francois Chollet on Working with large datasets like Imagenet.
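In code terms, that sampling idea boils down to something like the following sketch (load_image_sample is a hypothetical helper that loads a random, representative subset of the images into a NumPy array):

# hypothetical helper: load a random, representative subset of the training
# images into memory as an array of shape (n_samples, height, width, channels)
x_sample = load_image_sample(train_data_dir, n_samples=1000)

# fit the normalization/whitening statistics on the sample only ...
train_datagen.fit(x_sample)

# ... and keep streaming the full dataset from disk as before
train_generator = train_datagen.flow_from_directory(train_data_dir)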
There is also an ongoing open discussion about extending ImageDataGenerator along these lines.

Hope this helps...