ML Engine Experiment eval tf.summary.scalar not di

I am trying to output some summary scalars in an ML Engine experiment at both train and eval time. tf.summary.scalar('loss', loss) correctly outputs the summary scalars for both training and evaluation on the same plot in TensorBoard. However, I am also trying to output other metrics at both train and eval time, and they only appear at train time. The code immediately follows tf.summary.scalar('loss', loss) but does not appear to work. For example, the following code only produces output for TRAIN, not EVAL. The only difference is that these metrics use custom accuracy functions, but they do work for TRAIN.

if mode in (Modes.TRAIN, Modes.EVAL):
    loss = tf.contrib.legacy_seq2seq.sequence_loss(logits, outputs, weights)
    tf.summary.scalar('loss', loss)

    # Distinct variable name so the sequence_accuracy function is not shadowed.
    seq_accuracy = sequence_accuracy(targets, predictions, weights)
    tf.summary.scalar('sequence_accuracy', seq_accuracy)

Is there any reason why loss would plot in TensorBoard for both TRAIN and EVAL, while sequence_accuracy would only plot for TRAIN?

Could this behavior somehow be related to the warning I received "Found more than one metagraph event per run. Overwriting the metagraph with the newest event."?

Tags: python tensorflow tensorboard google-cloud-ml-engine

1 Answer

戒情不戒烟

Reply #2 · 2019-06-14 05:48

Because the summary node in the graph is just a node. It still needs to be evaluated (producing a serialized protobuf string), and that string still needs to be written to a file. It's not evaluated in training mode because it's not upstream of the train_op in your graph, and even if it were evaluated, it wouldn't be written to a file unless you specified a tf.train.SummarySaverHook as one of your training_chief_hooks in your EstimatorSpec. The Estimator class doesn't assume you want any extra evaluation during training, so normally evaluation is only done during the EVAL phase, and you just tune min_eval_frequency or the checkpoint frequency so that evaluation runs more often and you get more evaluation datapoints.

If you really, really want to log a summary during training, here's how you'd do it:

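def model_fn(mode, features, labels, params):
  ...
  if mode == Modes.TRAIN:
    # Loss is already written out during training, so don't duplicate its summary op.
    loss = tf.contrib.legacy_seq2seq.sequence_loss(logits, outputs, weights)
    # Distinct variable name so the sequence_accuracy function is not shadowed.
    seq_accuracy = sequence_accuracy(targets, predictions, weights)
    seq_sum_op = tf.summary.scalar('sequence_accuracy', seq_accuracy)
    # Make the summary op a dependency of train_op so it is evaluated on each training step.
    with tf.control_dependencies([seq_sum_op]):
      # Passing global_step lets the train op increment the Estimator's step counter.
      train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())

    return tf.estimator.EstimatorSpec(
      loss=loss,
      mode=mode,
      train_op=train_op,
      # The hook writes the evaluated summary to disk every 100 steps.
      training_chief_hooks=[tf.train.SummarySaverHook(
          save_steps=100,
          output_dir='./summaries',
          summary_op=seq_sum_op
      )]
    )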

But it's better to just evaluate more often and add an accuracy entry to eval_metric_ops using tf.metrics.accuracy (or tf.contrib.metrics.streaming_accuracy).
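
To see why a streaming metric suits the EVAL phase better than a one-off summary scalar, here is a minimal plain-Python sketch of the bookkeeping a metric such as tf.metrics.accuracy performs: it keeps a running total and count, updates them once per eval batch, and reports the ratio at the end. The class name StreamingAccuracy and the sample batches are hypothetical and only illustrate the idea; this is not TensorFlow's implementation.

```python
class StreamingAccuracy:
    """Running weighted accuracy, accumulated one batch at a time."""

    def __init__(self):
        self.total = 0.0  # weighted number of correct predictions so far
        self.count = 0.0  # sum of weights seen so far

    def update(self, targets, predictions, weights):
        """Fold one batch into the running totals and return the current value."""
        for target, prediction, weight in zip(targets, predictions, weights):
            self.total += weight * (1.0 if target == prediction else 0.0)
            self.count += weight
        return self.result()

    def result(self):
        """Accuracy over everything seen so far (0.0 before any update)."""
        return self.total / self.count if self.count > 0 else 0.0


metric = StreamingAccuracy()
# Two hypothetical eval batches of flattened token ids; weight 0.0 masks padding.
metric.update([1, 2, 3, 0], [1, 2, 4, 0], [1.0, 1.0, 1.0, 0.0])
metric.update([5, 6], [5, 6], [1.0, 1.0])
final_accuracy = metric.result()
print(final_accuracy)
```

In an Estimator, the (value, update_op) pair returned by a tf.metrics function plays the same two roles as result() and update() here: the update op runs on every eval batch, and the final value is what gets written as the eval summary.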
