Different loss values for test_on_batch and train_on_batch

Posted 2019-08-02 15:22

While trying to train a GAN for image generation I ran into a problem which I cannot explain.

When training the generator, the loss returned by train_on_batch drops straight to zero after just 2 or 3 iterations. Investigating further, I noticed some strange behavior of the train_on_batch method:

When I check the following:

noise = np.random.uniform(-1.0, 1.0, size=[batch_size, gen_noise_length])
predictions = GAN.stackedModel.predict(noise)

This returns values all close to zero, as I would expect, since the generator has not been trained yet.

However:

y = np.ones([batch_size, 1])
noise = np.random.uniform(-1.0, 1.0, size=[batch_size, gen_noise_length])
loss = GAN.stackedModel.train_on_batch(noise, y)

here the loss is almost zero, even though my targets are all ones. But when I run:

y = np.ones([batch_size, 1])
noise = np.random.uniform(-1.0, 1.0, size=[batch_size, gen_noise_length])
loss = GAN.stackedModel.test_on_batch(noise, y)

the returned loss is high as I would expect.

What is going on with the train_on_batch method? I'm really clueless here...

edit

My loss is binary cross-entropy, and I build the model like this:

def createStackedModel(self):
    # Build stacked GAN model
    gan_in = Input([self.noise_length])
    H = self.genModel(gan_in)
    gan_V = self.disModel(H)
    GAN = Model(gan_in, gan_V)
    opt = RMSprop(lr=0.0001, decay=3e-8)
    GAN.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
    return GAN

edit 2

The generator is constructed by stacking several blocks like the following, each containing a BatchNormalization layer:

    self.G.add(UpSampling2D())
    self.G.add(Conv2DTranspose(int(depth/8), 5, padding='same'))
    self.G.add(BatchNormalization(momentum=0.5))
    self.G.add(Activation('relu'))

edit 3

I uploaded my code to https://gitlab.com/benjamingraf24/DCGAN/ Apparently the problem results from the way I build the GAN network, so there must be something wrong in GANBuilder.py. However, I can't find it...

1 answer

仙女界的扛把子
2019-08-02 15:49

BatchNormalization layers behave differently during the training and testing phases.

During the training phase they use the current batch's mean and variance of the activations to normalize.

During the testing phase, however, they use the moving mean and moving variance collected during training. Without enough prior training, these collected values can be far from the actual batch statistics, which results in significantly different loss values.
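This difference is easy to reproduce with a single BatchNormalization layer. Below is a minimal sketch using tf.keras (the layer and data here are illustrative, not taken from the question's model): in training mode the layer normalizes with the batch statistics, while in inference mode it falls back on the moving statistics, which start at mean 0 and variance 1.

```python
import numpy as np
import tensorflow as tf

np.random.seed(0)

# A lone BatchNormalization layer, same momentum as in the question.
bn = tf.keras.layers.BatchNormalization(momentum=0.5)

# A batch whose activations are clearly not zero-mean.
x = np.random.normal(loc=5.0, scale=2.0, size=(32, 4)).astype("float32")

# Inference mode: uses the moving statistics, which are still at their
# initial values (mean 0, variance 1), so the input passes through
# almost unchanged and the output mean stays near 5.
test_out = bn(x, training=False).numpy()

# Training mode: normalizes with this batch's own mean and variance,
# so the output is approximately zero-mean.
train_out = bn(x, training=True).numpy()

print(test_out.mean())   # close to the input mean
print(train_out.mean())  # close to 0
```

This is exactly the gap the question observes: test_on_batch runs the model in inference mode, train_on_batch in training mode.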

Refer to the Keras documentation for BatchNormalization. The momentum argument defines how quickly the moving mean and moving variance adapt to the freshly collected batch statistics during training.
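The moving statistics are maintained as an exponential moving average. A minimal sketch of the update rule (not the actual library code; the function name is made up for illustration):

```python
# Sketch of the exponential-moving-average update that BatchNormalization
# applies to its moving mean (and, analogously, its moving variance)
# after each training batch.
def update_moving(moving, batch_value, momentum):
    return momentum * moving + (1.0 - momentum) * batch_value

# With momentum=0.5, as in the question, the moving mean converges
# toward the batch statistics quickly; with a high momentum like 0.99
# it would adapt much more slowly.
moving_mean = 0.0
for _ in range(3):
    moving_mean = update_moving(moving_mean, 5.0, momentum=0.5)
print(moving_mean)  # 4.375
```

With only a handful of batches seen, the moving statistics are still dominated by their initial values, which is why the test-mode loss can look so different early in training.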
