Keras + Tensorflow: Debug NaNs

Here's a great question on how to find the first occurrence of NaN in a TensorFlow graph:

Debugging nans in the backward pass

The answer is quite helpful; here is the code from it:

import tensorflow as tf

train_op = ...
# Attaches a NaN/Inf check to every floating-point tensor in the graph
check_op = tf.add_check_numerics_ops()

sess = tf.Session()
sess.run([train_op, check_op])  # Runs training and checks for NaNs

      

Apparently, running the training op and the numerics check together raises an error as soon as a NaN is encountered for the first time.
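As a rough mental model of that behaviour, here is a plain-Python sketch. It is not TensorFlow's implementation: `NumericsError`, `check_numerics`, `safe_log` and `forward` are hypothetical names made up for illustration. The idea is that every intermediate result passes through a check, so the error names the first step that produced a NaN or Inf instead of a NaN silently showing up in the final loss.

```python
import math


class NumericsError(ArithmeticError):
    """Raised when a named intermediate result contains NaN or Inf."""


def check_numerics(name, values):
    """Return values unchanged, or raise naming the step that went bad."""
    for index, value in enumerate(values):
        if not math.isfinite(value):
            raise NumericsError(
                f"{name}: non-finite value {value!r} at index {index}"
            )
    return values


def safe_log(x):
    """Log that yields -inf / NaN instead of raising, like a tensor op would."""
    if x > 0:
        return math.log(x)
    return float("-inf") if x == 0 else float("nan")


def forward(inputs):
    """Toy three-step computation with a check after every step."""
    shifted = check_numerics("shifted", [x - 1.0 for x in inputs])
    logged = check_numerics("logged", [safe_log(x) for x in shifted])
    return check_numerics("summed", [sum(logged)])


if __name__ == "__main__":
    print(forward([2.0, 3.0]))      # all values stay finite
    try:
        forward([2.0, 0.5])         # 0.5 - 1.0 is negative, so the log is NaN
    except NumericsError as err:
        print("caught:", err)       # the message names the "logged" step
```

The second call fails at the "logged" step rather than at "summed", which is the point: the check closest to the source of the NaN fires first.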

How do I integrate this into Keras? In the documentation, I cannot find anything similar to this.

I checked the code, too. The update step is executed here: https://github.com/fchollet/keras/blob/master/keras/engine/training.py

There is a function named _make_train_function, where the operation that computes the loss and applies the updates is created. This function is later called to train the network.

I could change the code like this (always assuming that we are running on the tf backend):

check_op = tf.add_check_numerics_ops()

self.train_function = K.function(inputs, 
    [self.total_loss] + self.metrics_tensors + [check_op],
    updates=updates, name='train_function', **self._function_kwargs)

      

I am currently trying to set this up properly and am not sure whether the code above actually works. Maybe there is an easier way?

python machine-learning neural-network tensorflow keras




No one has answered this question yet



