While trying to train a GAN for image generation I ran into a problem that I cannot explain.
When training the generator, the loss returned by train_on_batch drops straight to zero after just 2 or 3 iterations. After investigating I noticed some strange behavior of the train_on_batch method.
When I check the following:
noise = np.random.uniform(-1.0, 1.0, size=[batch_size, gen_noise_length])
predictions = GAN.stackedModel.predict(noise)
This returns values all close to zero as I would expect since the generator is not trained yet.
However:
y = np.ones([batch_size, 1])
noise = np.random.uniform(-1.0, 1.0, size=[batch_size, gen_noise_length])
loss = GAN.stackedModel.train_on_batch(noise, y)
here the loss is almost zero, even though the targets are all ones. When I run:
y = np.ones([batch_size, 1])
noise = np.random.uniform(-1.0, 1.0, size=[batch_size, gen_noise_length])
loss = GAN.stackedModel.test_on_batch(noise, y)
the returned loss is high as I would expect.
What is going on with the train_on_batch method? I'm really clueless here...
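To narrow it down, I also probed the output of the stacked model in training mode versus inference mode, since layers such as BatchNormalization and Dropout behave differently in the two phases, and train_on_batch computes its loss in training mode while predict and test_on_batch run in inference mode. This is only a diagnostic sketch using the Keras 2.x backend-function idiom; GAN.stackedModel, batch_size and gen_noise_length are the names from my code, everything else is standard Keras:

import numpy as np
from keras import backend as K

# Backend function that lets us pick the learning phase explicitly:
# 1 = training mode (BatchNormalization uses batch statistics),
# 0 = inference mode (BatchNormalization uses its moving averages).
probe = K.function([GAN.stackedModel.input, K.learning_phase()],
                   [GAN.stackedModel.output])

noise = np.random.uniform(-1.0, 1.0, size=[batch_size, gen_noise_length])
out_train = probe([noise, 1])[0]  # what train_on_batch "sees"
out_test = probe([noise, 0])[0]   # what predict / test_on_batch "see"
print(out_train.mean(), out_test.mean())

If these two outputs differ a lot, the gap between train_on_batch and test_on_batch would come from the training/inference behaviour of the layers rather than from the weight update itself.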
edit
My loss is binary cross-entropy and I build the model like this:
def createStackedModel(self):
    # Build stacked GAN model
    gan_in = Input([self.noise_length])
    H = self.genModel(gan_in)
    gan_V = self.disModel(H)
    GAN = Model(gan_in, gan_V)
    opt = RMSprop(lr=0.0001, decay=3e-8)
    GAN.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
    return GAN
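For comparison, the pattern I have seen in other DCGAN examples sets the discriminator to non-trainable before compiling the stacked model, so that train_on_batch on the stacked model only updates the generator and cannot push the discriminator towards always predicting one. The following is only a sketch of that pattern, reusing the attribute names from my code (genModel, disModel, noise_length); the method name createStackedModelFrozenDisc is hypothetical, and whether my GANBuilder.py already does the equivalent is exactly what I am unsure about:

from keras.layers import Input
from keras.models import Model
from keras.optimizers import RMSprop

def createStackedModelFrozenDisc(self):
    # Freeze the (separately compiled) discriminator inside the stacked model
    # so that only the generator weights are updated by train_on_batch.
    self.disModel.trainable = False
    gan_in = Input([self.noise_length])
    H = self.genModel(gan_in)
    gan_V = self.disModel(H)
    GAN = Model(gan_in, gan_V)
    opt = RMSprop(lr=0.0001, decay=3e-8)
    GAN.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
    return GAN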
edit 2
The generator is constructed by stacking several blocks like the following, each containing a BatchNormalization layer:
self.G.add(UpSampling2D())
self.G.add(Conv2DTranspose(int(depth/8), 5, padding='same'))
self.G.add(BatchNormalization(momentum=0.5))
self.G.add(Activation('relu'))
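For context, here is a rough sketch of how the whole generator is assembled from those blocks. Only the repeated block itself is copied from my actual code; the depth value, the initial Dense/Reshape projection and the tanh output layer are illustrative guesses:

from keras.models import Sequential
from keras.layers import Dense, Reshape, Activation, BatchNormalization
from keras.layers import UpSampling2D, Conv2DTranspose

def createGenerator(self):
    depth = 256  # illustrative value, not necessarily my exact setting
    dim = 7      # illustrative spatial size before upsampling
    self.G = Sequential()
    # Project the noise vector onto a small feature map
    self.G.add(Dense(dim * dim * depth, input_dim=self.noise_length))
    self.G.add(BatchNormalization(momentum=0.5))
    self.G.add(Activation('relu'))
    self.G.add(Reshape((dim, dim, depth)))
    # Repeated upsampling blocks, each with its own BatchNormalization
    self.G.add(UpSampling2D())
    self.G.add(Conv2DTranspose(int(depth/2), 5, padding='same'))
    self.G.add(BatchNormalization(momentum=0.5))
    self.G.add(Activation('relu'))
    self.G.add(UpSampling2D())
    self.G.add(Conv2DTranspose(int(depth/4), 5, padding='same'))
    self.G.add(BatchNormalization(momentum=0.5))
    self.G.add(Activation('relu'))
    # Single-channel output in [-1, 1]
    self.G.add(Conv2DTranspose(1, 5, padding='same'))
    self.G.add(Activation('tanh'))
    return self.G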
edit 3
I uploaded my code to https://gitlab.com/benjamingraf24/DCGAN/. Apparently the problem results from the way I build the GAN network, so there must be something wrong in GANBuilder.py. However, I can't find it...