Unbalanced data and weighted cross-entropy

I am trying to train a network on unbalanced data. I have A (198 samples), B (436 samples), C (710 samples) and D (272 samples). I have read about "weighted_cross_entropy_with_logits", but all the examples I found are for binary classification, so I'm not very sure how to set those weights.

Total samples: 1616

A_weight: 198/1616 = 0.12?

The idea, if I understood correctly, is to penalize the mistakes of the majority class and to value correct predictions on the minority classes more, right?

My code snippet:

weights = tf.constant([0.12, 0.26, 0.43, 0.17])
cost = tf.reduce_mean(tf.nn.weighted_cross_entropy_with_logits(logits=pred, targets=y, pos_weight=weights))


I have read this and other binary classification examples, but it is still not very clear to me.

Thanks in advance.



3 answers


Note that weighted_cross_entropy_with_logits is the weighted variant of sigmoid_cross_entropy_with_logits. Sigmoid cross entropy is typically used for binary classification. It can handle multiple labels, but sigmoid cross entropy basically makes a (binary) decision on each of them; for example, for a face recognition net, those (non-mutually-exclusive) labels could be "Does the subject wear glasses?", "Is the subject a woman?", etc.

In binary classification(s), each output channel corresponds to a binary (soft) decision. Therefore, the weighting needs to happen within the computation of the loss. This is what weighted_cross_entropy_with_logits does, by weighting one term of the cross-entropy over the other.
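
For concreteness, here is a minimal sketch with made-up labels and logits for two output channels, where pos_weight holds one (arbitrarily chosen) coefficient per channel to scale the positive term:

labels = tf.constant([[1.0, 0.0], [0.0, 1.0]])
logits = tf.constant([[2.0, -1.0], [-0.5, 3.0]])
# pos_weight > 1 increases the penalty on missed positives for that channel
loss = tf.reduce_mean(tf.nn.weighted_cross_entropy_with_logits(
    targets=labels, logits=logits, pos_weight=tf.constant([2.0, 0.5])))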

In mutually exclusive multiclass classification, we use softmax_cross_entropy_with_logits, which behaves differently: each output channel corresponds to the score of a class candidate. The decision comes afterwards, by comparing the respective outputs of each channel.

Weighting in before the final decision is therefore simply a matter of modifying the scores before comparing them, typically by multiplying them with weights. For example, for a ternary classification problem:

# your class weights
class_weights = tf.constant([[1.0, 2.0, 3.0]])
# deduce weights for batch samples based on their true label
weights = tf.reduce_sum(class_weights * onehot_labels, axis=1)
# compute your (unweighted) softmax cross entropy loss
unweighted_losses = tf.nn.softmax_cross_entropy_with_logits(labels=onehot_labels, logits=logits)
# apply the weights, relying on broadcasting of the multiplication
weighted_losses = unweighted_losses * weights
# reduce the result to get your final loss
loss = tf.reduce_mean(weighted_losses)


You could also rely on tf.losses.softmax_cross_entropy to handle the last three steps.
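
As a minimal sketch of that shortcut, reusing the onehot_labels, logits and per-sample weights defined above (the weighting and the reduction are then handled internally):

loss = tf.losses.softmax_cross_entropy(onehot_labels, logits, weights=weights)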



In your case, where you need to tackle a data imbalance problem, the class weights could indeed be inversely proportional to their frequency in your training data. Normalizing them so that they sum to one or to the number of classes also makes sense.
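
For example, here is a minimal sketch using the sample counts from the question (198, 436, 710, 272) to derive inverse-frequency class weights, normalized to sum to the number of classes:

import numpy as np

counts = np.array([198.0, 436.0, 710.0, 272.0])           # samples per class A, B, C, D
inv_freq = counts.sum() / counts                          # inverse of each class frequency
class_weights = inv_freq / inv_freq.sum() * len(counts)   # normalize to sum to num_classes
# wrap as a constant for the snippet above: tf.constant([class_weights], dtype=tf.float32)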

Note that in the above, we penalized the loss based on the true label of the samples. We could also have penalized the loss based on the estimated labels, by simply defining

weights = class_weights


and the rest of the code need not change, thanks to broadcasting magic.

In the general case, you would want weights that depend on the kind of error you make. In other words, for each pair of labels X and Y, you could choose how to penalize choosing label X when the true label is Y. You end up with a whole matrix of prior weights, which makes the weights above a full (num_samples, num_classes) tensor. This goes a bit beyond what you want, but it might still be useful to know that only your definition of the weight tensor needs to change in the code above.
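
Purely as an illustration (the penalty values below are made up), such a weight tensor could be built by gathering rows of a pairwise penalty matrix according to the true integer labels:

# hypothetical matrix for a ternary task: entry [y, x] is the penalty for
# predicting class x when the true class is y
penalty_matrix = tf.constant([[1.0, 2.0, 2.0],
                              [4.0, 1.0, 2.0],
                              [6.0, 3.0, 1.0]])
# labels holds the integer true class of each sample, shape (num_samples,)
weights = tf.gather(penalty_matrix, labels)   # shape (num_samples, num_classes)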



Check out this answer for an alternative solution that works with sparse_softmax_cross_entropy:



import tensorflow as tf
import numpy as np

np.random.seed(123)
sess = tf.InteractiveSession()

# let's say we have the logits and labels of a batch of size 6 with 5 classes
logits = tf.constant(np.random.randint(0, 10, 30).reshape(6, 5), dtype=tf.float32)
labels = tf.constant(np.random.randint(0, 5, 6), dtype=tf.int32)

# specify some class weightings
class_weights = tf.constant([0.3, 0.1, 0.2, 0.3, 0.1])

# specify the weights for each sample in the batch (without having to compute the onehot label matrix)
weights = tf.gather(class_weights, labels)

# compute the loss
tf.losses.sparse_softmax_cross_entropy(labels, logits, weights).eval()




Read from here. This will resolve all your doubts; read the whole thread for a complete understanding.
