TensorFlow: what is the exact formula applied in `tf.nn.sparse_softmax_cross_entropy_with_logits`?
I tried to manually recalculate the outputs of this function, so I created a minimal example:

    import numpy as np
    import tensorflow as tf

    logits = tf.pack(np.array([[[[0,1,2]]]], dtype=np.float32))  # image of shape (1, 1, 1, 3)
    labels = tf.pack(np.array([[[1]]], dtype=np.int32))          # ground truth of shape (1, 1, 1)
    softmaxCrossEntropie = tf.nn.sparse_softmax_cross_entropy_with_logits(logits, labels)
    softmaxCrossEntropie.eval()  # --> output is [1.41]
Now, according to my own calculation I only get [1.23]. For the manual calculation I just apply softmax followed by cross-entropy:

    H(p, q) = -sum_j p(x_j) * log(q(x_j))

where q(x_j) = sigma(x_j) or (1 - sigma(x_j)), depending on whether j is the ground-truth class or not, and p(x) = labels, which are one-hot encoded.
I'm not sure where this difference comes from; I can't imagine an epsilon term making such a big difference. Does anyone know where I can look up the exact formula used by TensorFlow? Is the source code for this exact part available? I could only find nn_ops.py, but it just calls another function, gen_nn_ops._sparse_softmax_cross_entropy_with_logits, which I could not find on GitHub...
In the cross-entropy equation, p(x) is the true distribution and q(x) is the distribution obtained from the softmax. When p(x) is one-hot (and here it is, otherwise sparse cross-entropy could not be applied at all), the cross-entropy reduces to the negative log-probability of the true class. Note that the softmax is applied over all classes jointly, so the per-class sigmoid terms from your manual calculation do not appear.

In your example, softmax(logits) is the vector [0.09003057, 0.24472847, 0.66524096], so the loss is -log(0.24472847) = 1.4076059, which is exactly the value TensorFlow returned.
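This reduction is easy to verify with NumPy alone. Below is a minimal sketch (the function name is my own, not TensorFlow's); it uses the shifted log-sum-exp form for numerical stability, which is the standard way such ops are implemented:

```python
import numpy as np

def sparse_softmax_cross_entropy(logits, label):
    """Cross-entropy with a one-hot target: -log(softmax(logits)[label])."""
    # Subtracting the max before exponentiating avoids overflow and
    # does not change the softmax result.
    shifted = logits - np.max(logits)
    log_softmax = shifted - np.log(np.sum(np.exp(shifted)))
    return -log_softmax[label]

logits = np.array([0.0, 1.0, 2.0], dtype=np.float32)
loss = sparse_softmax_cross_entropy(logits, label=1)
print(loss)  # ~1.4076059, matching tf.nn.sparse_softmax_cross_entropy_with_logits
```

Running this reproduces the [1.41] you observed, confirming that no epsilon or extra term is involved.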