Where is the original source code of sparse_softmax_cross_entropy_with_logits?

Posted 2019-07-28 18:57

I want to know what the TensorFlow function sparse_softmax_cross_entropy_with_logits is doing mathematically, exactly. But I can't find the original source code. Can you help me?

3 answers
爷、活的狠高调
#2 · 2019-07-28 19:35

The most important part of the implementation is here starting at line 132.

This functor is called by the kernel implementation.

It uses a not-very-well-documented Eigen feature called generators, which allows writing fairly flexible code and having it compiled both for the CPU and, by nvcc, for the GPU.

时光不老,我们不散
#3 · 2019-07-28 19:41

In the head version (as of today), you can find the function in https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/nn_ops.py line #424.

The comment says:

Measures the probability error in discrete classification tasks in which the classes are mutually exclusive (each entry is in exactly one class). For example, each CIFAR-10 image is labeled with one and only one label: an image can be a dog or a truck, but not both.
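For concreteness, a minimal usage sketch of that function (the tensor values and shapes here are made up for illustration):

import tensorflow as tf

# Three examples, four mutually exclusive classes.
logits = tf.constant([[2.0, 1.0, 0.1, -1.0],
                      [0.5, 2.5, 0.3,  0.0],
                      [1.0, 1.0, 1.0,  1.0]])
labels = tf.constant([0, 1, 3])  # one integer class index per example

# Returns one cross-entropy value per example (shape [3]), not a scalar.
loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)

Note that the labels are plain integer class indices (one per example), not one-hot vectors; that is what the "sparse" in the name refers to, in contrast to softmax_cross_entropy_with_logits.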

Emotional °昔
#4 · 2019-07-28 19:54

sparse_softmax_cross_entropy_with_logits is equivalent to a numerically stable version of the following:

-1. * tf.gather(tf.log(tf.nn.softmax(logits)), target)

or, in more "readable" numpy-code:

-1. * np.log(softmax(logits))[target]

where softmax(x) = np.exp(x)/np.sum(np.exp(x)).

That is, it computes the softmax of the provided logits, takes the log thereof to retrieve the log-probabilities, and slices the log-probabilities to retrieve the log-probability of the target.
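For a whole batch, that same naive (non-stable) computation can be sketched in NumPy like this (the function names are mine, purely for illustration):

import numpy as np

def softmax(x):
    # Row-wise softmax: exp(x) normalized over the class axis.
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def naive_sparse_xent(logits, targets):
    # logits: [batch, num_classes] floats; targets: [batch] integer class indices.
    log_probs = np.log(softmax(logits))
    # For each row, pick the log-probability of that row's target class.
    return -log_probs[np.arange(len(targets)), targets]

logits = np.array([[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]])
targets = np.array([0, 2])
print(naive_sparse_xent(logits, targets))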

However, it does so in a numerically stable way (a couple of things can go wrong in the naive formulation, e.g. overflow in the exponential or taking the log of zero), for instance by adding small values to some of the operations. This means that computing the verbose version above will only approximately reproduce the values of nn.sparse_softmax_cross_entropy_with_logits (in some tests the difference was consistently smaller than 2e-6).
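The answer above does not show how that stability is achieved; a standard way to get it (an assumption here, not necessarily TensorFlow's exact code path) is to compute the log-softmax directly with the max-subtraction / log-sum-exp trick, which avoids both overflow in the exponential and log(0):

import numpy as np

def stable_sparse_xent(logits, targets):
    # log softmax(x)_t = (x_t - max(x)) - log(sum(exp(x - max(x))))
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_sum_exp = np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    log_probs = shifted - log_sum_exp
    return -log_probs[np.arange(len(targets)), targets]

# Large logits overflow the naive exp(); the shifted version does not.
logits = np.array([[1000.0, 0.0, -1000.0]])
targets = np.array([0])
print(stable_sparse_xent(logits, targets))  # ~0.0, no NaN or inf

On well-behaved inputs this agrees with the naive version up to floating-point error; on extreme logits the naive version produces inf or NaN while this one does not.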
