I am wondering why in Tensorflow version 1.5.0 and later, softmax_cross_entropy_with_logits_v2 defaults to backpropagating into both labels and logits. What are some applications/scenarios where you would want to backprop into labels?
相关问题
- batch_dot with variable batch size in Keras
- How to use Reshape keras layer with two None dimen
- CV2 Image Error: error: (-215:Assertion failed) !s
- Why keras use “call” instead of __call__?
- How to conditionally scale values in Keras Lambda
相关文章
- tensorflow 神经网络 训练集准确度远高于验证集和测试集准确度?
- Tensorflow: device CUDA:0 not supported by XLA ser
- how to flatten input in `nn.Sequential` in Pytorch
- Numpy array to TFrecord
- conditional graph in tensorflow and for loop that
- How to downgrade to cuda 10.0 in arch linux?
- Apply TensorFlow Transform to transform/scale feat
- How to use cross_val_score with random_state
I saw the github issue below asking the same question, you might want to follow it for future updates.
https://github.com/tensorflow/minigo/issues/37
I don't speak for the developers who made this decision, but I would surmise that they would do this by default because it is indeed used often, and for most application where you aren't backpropagating into the labels, the labels are a constant anyway and won't be adversely affected.
Two common uses cases for backpropagating into labels are:
There is a whole field of study around building adversarial examples that fool a neural network. Many of the approaches used to do so involve training a network, then holding the network fixed and backpropagating into the labels (original image) to tweak it (under some constraints usually) to produce a result that fools the network into misclassifying the image.
I also recommend people watch the deepviz toolkit video on youtube, you'll learn a ton about the internal representations learned by a neural network.
https://www.youtube.com/watch?v=AgkfIQ4IGaM
If you continue digging into that and find the original paper you'll find that they also backpropagate into the labels to generate images which highly activate certain filters in the network in order to understand them.