Semantic segmentation in TensorFlow gives zero loss

I am training a model to segment machine-printed text out of images. The images can also contain barcodes and handwritten text. The ground truth images are processed so that 0 represents the machine print and 1 represents everything else. I am using a 5-layer CNN with dilation which outputs 2 maps at the end.
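For context, the network looks roughly like this (a minimal sketch only; the filter counts, kernel sizes and dilation rates here are placeholders, not my exact values):

import tensorflow as tf  # TensorFlow 1.x

def inference(images):
    # images: [batch, height, width, channels]
    net = images
    # Four 3x3 conv layers with increasing dilation to grow the receptive field.
    for i, rate in enumerate([1, 2, 4, 8]):
        net = tf.layers.conv2d(net, filters=32, kernel_size=3,
                               dilation_rate=rate, padding='same',
                               activation=tf.nn.relu, name='conv%d' % i)
    # A final 1x1 conv gives the 2 output maps:
    # channel 0 = machine print, channel 1 = everything else.
    return tf.layers.conv2d(net, filters=2, kernel_size=1,
                            padding='same', name='logits')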

My loss is calculated like this:

import tensorflow as tf

def loss(logits, labels):
    # Flatten the per-pixel maps: logits -> [num_pixels, 2], labels -> [num_pixels].
    logits = tf.reshape(logits, [-1, 2])
    # Labels must be an integer type for the sparse cross-entropy op.
    labels = tf.cast(tf.reshape(labels, [-1]), tf.int32)

    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=labels)
    cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
    return cross_entropy_mean

Some of my images contain only handwritten text, and their respective ground truths are blank pages, i.e. all 1s.

When I train the model, I get a loss of 0 and a training accuracy of 100% for these images. Is that right? How can the loss be zero? For other images containing barcodes or machine print, I get some loss and they converge as expected.
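I even did a quick numerical sanity check with toy values (the logits below are hypothetical, just to reproduce the computation): if the network strongly favours class 1 at every pixel, the loss on an all-ones ground truth really does come out as essentially zero.

import numpy as np
import tensorflow as tf  # TensorFlow 1.x

# A 2x2 "blank page" ground truth (all 1s) and logits that strongly
# favour class 1 at every pixel.
toy_labels = np.ones((1, 2, 2), dtype=np.int32)
toy_logits = np.zeros((1, 2, 2, 2), dtype=np.float32)
toy_logits[..., 1] = 10.0  # large logit for class 1

with tf.Session() as sess:
    print(sess.run(loss(tf.constant(toy_logits), tf.constant(toy_labels))))
    # prints ~4.5e-05, i.e. effectively zero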

When I test this model, the barcodes are correctly ignored, but it outputs both the machine print and the handwriting, whereas I only want the machine print.

Can anyone point out where I am going wrong, please?

UPDATE 1:

I was using a learning rate of 0.01 before; changing it to 0.0001 gave me some loss and it seemed to converge, though not very well. But why would a high learning rate give a loss of 0?

When I used the same model in Caffe with a learning rate of 0.01, it gave some loss and converged well compared to TensorFlow.

1 answer


Your loss calculation looks fine, but a loss of zero in your case is strange. Have you tried playing with the learning rate? Lowering it might help. I have run into strange loss values before and fixed them by decreasing my learning rate.
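For example (TensorFlow 1.x style; the optimizer and the exact value are just illustrations):

import tensorflow as tf

# total_loss is the cross-entropy from the question; dropping the
# learning rate by an order of magnitude or two is a cheap first test.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-4)
train_op = optimizer.minimize(total_loss)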


