What is the difference between tf.gradients and tf.train.Optimizer.compute_gradients?

It seems that tf.gradients also allows computing Jacobians, i.e. the partial derivatives of each entry of one tensor with respect to each entry of another tensor, whereas tf.train.Optimizer.compute_gradients only computes actual gradients, e.g. the partial derivatives of a scalar value with respect to each entry of a specific tensor, or with respect to one particular scalar. Why is there a separate function if tf.gradients already implements that functionality?



1 answer


tf.gradients doesn't let you compute the Jacobian: it aggregates the gradients of all outputs with respect to each input (a bit like summing each column of the actual Jacobian matrix). In fact, there is no "good" way of computing Jacobians in TensorFlow (basically you have to call tf.gradients once per output, see this issue).
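For illustration, a minimal sketch of this behavior, assuming the TensorFlow 1.x graph-mode API (placeholder names and the toy function are made up for the example):

```python
import tensorflow as tf  # assumes TensorFlow 1.x (graph mode)

x = tf.placeholder(tf.float32, shape=[3])
y = x * x  # elementwise square: the Jacobian dy_i/dx_j is a 3x3 diagonal matrix

# tf.gradients aggregates over the outputs: the result has the shape of x,
# and entry j holds sum_i dy_i/dx_j (here simply 2 * x_j).
summed = tf.gradients(y, x)[0]

# To recover the full Jacobian you have to call tf.gradients once per output.
jacobian = tf.stack([tf.gradients(y[i], x)[0] for i in range(3)])  # shape [3, 3]

with tf.Session() as sess:
    print(sess.run([summed, jacobian], feed_dict={x: [1.0, 2.0, 3.0]}))
    # summed   -> [2. 4. 6.]
    # jacobian -> the diagonal matrix diag([2., 4., 6.])
```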



As for tf.train.Optimizer.compute_gradients, yes, its result is basically the same, but it takes care of some details automatically and returns a slightly more convenient output format. If you look at the implementation, you will see that it is essentially a call to tf.gradients (in this case aliased as gradients.gradients), plus the surrounding logic that is useful to have already implemented for the optimizer. In addition, exposing it as a method allows subclasses to extend the behavior, either to implement some optimization strategy (although not much actually happens at the compute_gradients stage), or for auxiliary purposes such as tracing or debugging.
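To make the difference in output format concrete, here is a small sketch of the two call styles, again assuming the TensorFlow 1.x API; the variable names and toy loss are invented for the example:

```python
import tensorflow as tf  # assumes TensorFlow 1.x (graph mode)

w = tf.Variable([1.0, 2.0])
b = tf.Variable(0.5)
loss = tf.reduce_sum(w * w) + b

opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)

# compute_gradients returns (gradient, variable) pairs, the format that
# apply_gradients expects; internally it boils down to a tf.gradients call.
grads_and_vars = opt.compute_gradients(loss, var_list=[w, b])

# The low-level equivalent: tf.gradients returns only the gradient tensors,
# and you have to track which variable each one belongs to yourself.
grads = tf.gradients(loss, [w, b])

# Typical pattern: inspect or transform the pairs, then hand them back.
train_op = opt.apply_gradients(grads_and_vars)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(grads))  # [array([2., 4.], dtype=float32), 1.0]
    sess.run(train_op)      # one gradient-descent step
```

The (gradient, variable) pairing is what makes it easy to inspect or modify gradients (e.g. clipping) before handing them back to apply_gradients, which is part of why the optimizer wraps tf.gradients instead of exposing it directly.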
