Question:
I want to know what the TensorFlow function sparse_softmax_cross_entropy_with_logits is doing mathematically, but I can't find where it is implemented in the source code. Can you help me?
Answer 1:
sparse_softmax_cross_entropy_with_logits is equivalent to a numerically stable version of the following:
-1. * tf.gather(tf.log(tf.nn.softmax(logits)), target)
or, in more "readable" numpy code:
-1. * np.log(softmax(logits))[target]
where softmax(x) = np.exp(x) / np.sum(np.exp(x)).
That is, it computes the softmax of the provided logits, takes the log thereof to retrieve the log-probabilities, and slices the log-probabilities to retrieve the log-probability of the target.
However, it does so in a numerically stable way (a couple of things can go wrong with the naive formula, e.g. np.exp can overflow for large logits) by adding small values to some of the operations. This means that computing the verbose version above will only approximately match the values produced by nn.sparse_softmax_cross_entropy_with_logits (in some tests the difference was consistently smaller than 2e-6).
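To make the equivalence concrete, here is a minimal numpy sketch that computes both the verbose formula above and a log-sum-exp variant that sidesteps the overflow issue; the example values and names are made up for illustration:

import numpy as np

# Made-up example: one sample with 5 classes.
logits = np.array([2.0, 1.0, 0.1, -1.2, 3.3])
target = 2  # integer class index

def softmax(x):
    return np.exp(x) / np.sum(np.exp(x))

# Verbose version, exactly as in the formula above.
naive_loss = -1. * np.log(softmax(logits))[target]

# More stable version: subtract the max before exponentiating and
# use the identity -log softmax(x)[t] = logsumexp(x) - x[t].
shifted = logits - np.max(logits)
stable_loss = np.log(np.sum(np.exp(shifted))) - shifted[target]

print(naive_loss, stable_loss)  # agree to within floating-point error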
Answer 2:
The most important part of the implementation is here starting at line 132.
This functor is called by the kernel implementation.
It uses a not-very-well-documented feature of Eigen called generators, which allows writing fairly flexible code that compiles both for the CPU and, via nvcc, for the GPU.
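The functor itself is C++/Eigen, but the row-wise math it implements can be sketched in numpy. This is only a transliteration of the formula, not the actual kernel code, and the batched shapes and names below are my own; the op also emits a backprop tensor alongside the loss, which mathematically is softmax(logits) minus the one-hot encoding of the labels:

import numpy as np

def sparse_xent_sketch(logits, labels):
    # logits: [batch, num_classes] floats; labels: [batch] integer class ids.
    shifted = logits - logits.max(axis=1, keepdims=True)  # row-wise max for stability
    exp = np.exp(shifted)
    softmax = exp / exp.sum(axis=1, keepdims=True)
    log_sum_exp = np.log(exp.sum(axis=1))
    rows = np.arange(len(labels))
    loss = log_sum_exp - shifted[rows, labels]  # -log softmax(logits)[label], per row
    backprop = softmax.copy()
    backprop[rows, labels] -= 1.0               # softmax - one_hot(labels)
    return loss, backprop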
Answer 3:
In the HEAD version (as of this writing), you can find the function in https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/nn_ops.py at line #424.
The comment says:
Measures the probability error in discrete classification tasks in which the classes are mutually exclusive (each entry is in exactly one class). For example, each CIFAR-10 image is labeled with one and only one label: an image can be a dog or a truck, but not both.
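In other words, the "sparse" variant takes integer class indices rather than one-hot vectors. A minimal usage sketch, written against the current TF 2 API (the answers above predate it, e.g. tf.log is now tf.math.log), with made-up shapes:

import tensorflow as tf

# Made-up CIFAR-10-style batch: 4 images, 10 mutually exclusive classes.
logits = tf.random.normal([4, 10])   # unscaled scores from some model
labels = tf.constant([3, 0, 9, 1])   # one integer class id per image, not one-hot

loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)
print(loss.shape)  # (4,) -- one cross-entropy value per example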