I want to know what the TensorFlow function sparse_softmax_cross_entropy_with_logits is doing mathematically, exactly. But I can't find where the underlying code lives. Can you help me?
The most important part of the implementation is here starting at line 132.
This functor is called by the kernel implementation.
It uses a not-very-well-documented feature of Eigen called generators, which lets you write fairly flexible code that compiles both for the CPU and, via nvcc, for the GPU.
In the head version (as of today), you can find the function at line 424 of https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/nn_ops.py.
The comment says:
sparse_softmax_cross_entropy_with_logits is equivalent to a numerically stable version of the following:
-1. * tf.gather(tf.log(tf.nn.softmax(logits)), target)
or, in more "readable" numpy-code:
-1. * np.log(softmax(logits))[target]
where
softmax(x) = np.exp(x)/np.sum(np.exp(x))
That is, it computes the softmax of the provided logits, takes the log to obtain the log-probabilities, and indexes into those log-probabilities to retrieve the log-probability of the target class.
However, it does so in a numerically stable way (the naive formulation can overflow in the exponential or end up taking the log of zero). This means that computing the verbose version above will only approximately reproduce the values of
nn.sparse_softmax_cross_entropy_with_logits
(in some quick tests the difference was consistently smaller than 2e-6).
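If you want to check this equivalence yourself, here is a minimal sketch (not the actual kernel code). It assumes TensorFlow 2 with eager execution; the array shapes and target values are just illustrative. It computes the verbose version both naively and with the log-sum-exp trick and compares each to the built-in op:

    import numpy as np
    import tensorflow as tf

    def naive_loss(logits, targets):
        # -log(softmax(logits))[target], computed the "verbose" way;
        # np.exp can overflow for large logits and np.log can hit log(0).
        exp = np.exp(logits)
        softmax = exp / np.sum(exp, axis=-1, keepdims=True)
        return -np.log(softmax[np.arange(len(targets)), targets])

    def stable_loss(logits, targets):
        # Same quantity via the log-sum-exp trick: shifting by the row
        # maximum keeps the exponentials bounded.
        shifted = logits - np.max(logits, axis=-1, keepdims=True)
        log_softmax = shifted - np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))
        return -log_softmax[np.arange(len(targets)), targets]

    logits = np.random.randn(4, 10).astype(np.float32)
    targets = np.array([3, 1, 0, 7], dtype=np.int32)

    tf_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=targets, logits=logits).numpy()

    print(np.max(np.abs(naive_loss(logits, targets) - tf_loss)))   # tiny, on the order of 1e-6
    print(np.max(np.abs(stable_loss(logits, targets) - tf_loss)))  # tiny as well

With moderately sized random logits all three agree to within a few times float32 rounding error; the naive version only starts to diverge (or produce inf/nan) once the logits get large enough for exp to overflow.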