Neural network for unbalanced multiclass classification with multiple labels

How do you deal with multi-label classification that has imbalanced classes when training a neural network? One solution I came across was to penalize errors more heavily for sparsely represented classes. This is how I designed the network:

Number of classes: 100. The input layer, 1st hidden layer, and 2nd layer (100 units) are fully connected, with dropout and ReLU. The output of the second hidden layer is py_x.

# mean sigmoid cross entropy over all classes and the batch
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=py_x, labels=Y))

Where Y is a modified version of the one-hot encoding, with values from 1 to 5 set for each sample's labels. The value is ~1 for the most frequent labels and ~5 for the rarest. The values are not discrete; that is, the new value that replaces a 1 in the one-hot encoding is given by the formula

weight = 1 + 4 * (1 - (percentage of label / 100))

For example, <0, 0, 1, 0, 1, ...> will be converted to something like <0, 0, 1.034, 0, 3.667, ...>. Note: only the 1 values in the original vectors are changed.
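To make this concrete, here is a minimal NumPy sketch of the weighting scheme described above. The frequency values and the names label_freq_percent and weight_targets are hypothetical; in practice the percentages would be computed from the training data.

    import numpy as np

    num_classes = 100
    # Hypothetical per-label frequencies in percent; compute these from your data
    label_freq_percent = np.linspace(0.5, 50.0, num_classes)

    # weight = 1 + 4 * (1 - percentage / 100): ~1 for frequent labels, ~5 for rare ones
    label_weights = 1.0 + 4.0 * (1.0 - label_freq_percent / 100.0)

    def weight_targets(one_hot, weights):
        # Replace each 1 in a multi-hot target vector with its label's weight
        return one_hot * weights

    y = np.zeros(num_classes, dtype=np.float32)
    y[[2, 4]] = 1.0  # labels 2 and 4 are present in this sample
    y_weighted = weight_targets(y, label_weights)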

Thus, if the model mispredicts a rare label, the error will be large, for example 0.0001 - 5 = -4.9999, and this will backpropagate a more severe error than mispredicting a very frequent label.

Is this penalty correct? Are there better methods for solving this problem?

+8




1 answer


Let me answer your problem in general terms. What you are running into is the class imbalance problem, and there are many ways to address it. Common approaches:

  1. Resampling the dataset: balance the classes by resizing the dataset.
    For example, if you have 5 target classes (A through E), and classes A, B, C, and D have 1000 examples each while class E has only 10, you can simply add 990 more examples of class E (just duplicate them, or duplicate them and add some noise); see the first sketch after this list.
  2. Cost-sensitive modeling: changing the importance (weight) of different classes.
    This is the method you used in your code, where you increased the importance (weight) of a class by up to a factor of 5. A weighted-loss sketch follows the paragraph after this list.
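As a rough illustration of the first approach, here is a sketch of oversampling by duplication with noise. The name X_e and the noise level are hypothetical; it assumes the minority-class examples are numeric feature rows.

    import numpy as np

    rng = np.random.default_rng(0)

    def oversample(X, n_needed, noise_std=0.01):
        # Sample rows with replacement and add small Gaussian noise to each copy
        idx = rng.integers(0, len(X), size=n_needed)
        return X[idx] + rng.normal(0.0, noise_std, size=(n_needed, X.shape[1]))

    X_e = rng.normal(size=(10, 20))          # the 10 class-E examples, 20 features
    X_extra = oversample(X_e, n_needed=990)  # brings class E up to 1000 examples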

Coming back to your problem: the first solution is independent of your model. You just need to check whether you can change the dataset (add more samples to the classes with fewer samples, or remove samples from the classes with more). For the second solution, since you are working with a neural network, you change the loss function: define the class weights (importance) as hyperparameters and train your model to see which set of weights performs best.
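One common variant of the second approach (a sketch, not your exact code) keeps Y as a plain multi-hot vector and multiplies the element-wise sigmoid cross entropy by a per-class weight vector, which you can then tune as a hyperparameter:

    import tensorflow as tf

    num_classes = 100
    # Tunable per-class weights; all-ones placeholder values for illustration
    class_weights = tf.ones([num_classes])

    def weighted_cost(logits, labels, weights):
        # Element-wise sigmoid cross entropy, shape (batch, num_classes)
        per_label = tf.nn.sigmoid_cross_entropy_with_logits(logits=logits, labels=labels)
        # Scale each label's loss by its class weight, then average
        return tf.reduce_mean(per_label * weights)

    logits = tf.zeros([4, num_classes])  # dummy batch of 4
    labels = tf.zeros([4, num_classes])  # multi-hot targets
    cost = weighted_cost(logits, labels, class_weights)

TensorFlow also provides tf.nn.weighted_cross_entropy_with_logits, which applies a pos_weight multiplier to the positive term and can serve the same purpose.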



So, to answer your question: yes, this is a correct way to penalize, but you may get better accuracy by trying different weights (instead of the maximum of 5 in your example). Alternatively, you can try resampling the dataset.

For more information, you can refer to this link.

+1








