Keras + Tensorflow: Debug NaNs
Here is a great question on how to find the first occurrence of NaN in a TensorFlow graph:
Debugging nans in the backward pass
The answer is quite helpful, here is the code from it:
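import tensorflow as tf  # TF 1.x graph-mode API

train_op = ...
check_op = tf.add_check_numerics_ops()  # adds a numerics check for the floating-point tensors in the graph
sess = tf.Session()
sess.run([train_op, check_op])  # Runs training and checks for NaNs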
Apparently, running the training op and the numerics check together raises an error as soon as a NaN is encountered for the first time.
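To make that behaviour concrete, here is a minimal pure-Python sketch of the idea as I understand it: check every intermediate result and fail at the first op that produces a non-finite value. It is not how TensorFlow implements the check. The names `assert_finite`, `run_with_checks` and `safe_log`, and the three-step "graph", are made up for illustration, and I am assuming that Inf counts as a failure as well as NaN.

```python
import math


def assert_finite(name, values):
    """Raise as soon as `values` contains a NaN or an infinity."""
    for index, value in enumerate(values):
        if math.isnan(value) or math.isinf(value):
            raise FloatingPointError(
                "%s: non-finite value %r at index %d" % (name, value, index))
    return values


def run_with_checks(steps, inputs):
    """Run the named steps in order, checking every intermediate result."""
    values = assert_finite("inputs", inputs)
    for name, fn in steps:
        values = assert_finite(name, [fn(v) for v in values])
    return values


def safe_log(x):
    # Mimic array-library semantics: log(0) is -inf and log(x < 0) is NaN,
    # instead of raising ValueError like math.log does.
    if x > 0:
        return math.log(x)
    return float("-inf") if x == 0 else float("nan")


# Hypothetical three-op "graph": log(0) gives -inf, and -inf * 0 would give NaN.
steps = [
    ("log", safe_log),
    ("times_zero", lambda x: x * 0.0),
    ("plus_one", lambda x: x + 1.0),
]

try:
    run_with_checks(steps, [1.0, 0.0])
    first_bad_op = None
except FloatingPointError as err:
    # The message starts with the name of the step that failed first.
    first_bad_op = str(err).split(":")[0]
```

Without the per-step check, the NaN would only show up after the `times_zero` step and would then propagate into the final result. With the check, the error already names `log`, which is where the problem actually starts.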
How do I integrate this into Keras? In the documentation, I cannot find anything similar to this.
I checked the code too. The update step is done here: https://github.com/fchollet/keras/blob/master/keras/engine/training.py
There is a function named _make_train_function, where the op that computes the loss and applies the updates is created. This function is later called to train the network.
I could change the code like this (always assuming that we are running on the tf backend):
check_op = tf.add_check_numerics_ops()
self.train_function = K.function(inputs,
                                 [self.total_loss] + self.metrics_tensors + [check_op],
                                 updates=updates,
                                 name='train_function',
                                 **self._function_kwargs)
I am currently trying to set this up properly and am not sure whether the code above actually works. Maybe there is an easier way?