The "updates" argument in Theano functions

What does the "updates" argument do when called this way?

f_grad_shared = theano.function([x, mask, y], cost, updates=zgup + rg2up,
                                    name='adadelta_f_grad_shared')

      

All the documentation I've seen about the "updates" argument in Theano functions talks about pairs of the form (shared variable, expression used to update the shared variable). However, there are only expressions here, so how do I know which shared variables get updated?

I assume the shared variables are somehow implicit, but zgup and rg2up each depend on different shared variables:

zipped_grads = [theano.shared(p.get_value() * numpy_floatX(0.),
                              name='%s_grad' % k)
                for k, p in tparams.iteritems()]

running_grads2 = [theano.shared(p.get_value() * numpy_floatX(0.),
                                name='%s_rgrad2' % k)
                  for k, p in tparams.iteritems()]

zgup = [(zg, g) for zg, g in zip(zipped_grads, grads)]
rg2up = [(rg2, 0.95 * rg2 + 0.05 * (g ** 2))
         for rg2, g in zip(running_grads2, grads)]

      

This code comes from lstm.py at http://deeplearning.net/tutorial/lstm.html.

Thanks!



1 answer


You are correct in thinking that updates should be a list (or dictionary) of key/value pairs, where each key is a shared variable and each value is a symbolic expression describing how to update that shared variable.
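For example, here is a minimal, self-contained sketch (my own toy example, not from the tutorial) showing a single (shared variable, expression) pair: every call to the compiled function replaces the value of state with state + inc.

import numpy
import theano
import theano.tensor as T

# State that persists between calls to the compiled function.
state = theano.shared(numpy.float64(0.0), name='state')
inc = T.dscalar('inc')

# updates is a list of (shared variable, new value expression) pairs:
# after every call, `state` is replaced by the value of `state + inc`.
accumulate = theano.function([inc], state, updates=[(state, state + inc)])

print(accumulate(2.0))      # prints the old value of state: 0.0
print(state.get_value())    # state has now been updated to 2.0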

These two lines create pairs:

zgup = [(zg, g) for zg, g in zip(zipped_grads, grads)]
rg2up = [(rg2, 0.95 * rg2 + 0.05 * (g ** 2))
         for rg2, g in zip(running_grads2, grads)]

      

zipped_grads and running_grads2 were created in the earlier lines; each is a list of shared variables. Here, those shared variables are paired with their update expressions via Python's zip function, which yields a list of pairs. In fact, the first of these two lines could simply be replaced with



zgup = zip(zipped_grads, grads)
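
One small aside (not part of the tutorial): the tutorial targets Python 2 (note tparams.iteritems()), where zip returns a list. On Python 3, zip returns an iterator, so you would materialize it explicitly; in particular, the zgup + rg2up concatenation in your question only works if both are lists. A tiny runnable illustration with hypothetical stand-in values:

grads = ['g_W', 'g_b']               # hypothetical stand-ins for the gradient expressions
zipped_grads = ['W_grad', 'b_grad']  # hypothetical stand-ins for the shared variables

# Tutorial form (Python 2, where zip already returns a list):
zgup = [(zg, g) for zg, g in zip(zipped_grads, grads)]

# Equivalent, and also a list on Python 3:
zgup = list(zip(zipped_grads, grads))

print(zgup)  # [('W_grad', 'g_W'), ('b_grad', 'g_b')]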

      

This particular code is fairly complex because it implements the AdaDelta update mechanism. If you want to see how updates works in a simpler setting, take a look at the basic stochastic gradient descent update in the Theano MLP tutorial:

updates = [
        (param, param - learning_rate * gparam)
        for param, gparam in zip(classifier.params, gparams)
    ]
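
To see the whole mechanism end to end, here is a minimal, self-contained sketch (my own toy example, not the tutorial's model): it computes a gradient with T.grad, builds an updates list of (shared variable, expression) pairs, and compiles a training function. Each call returns the cost and, as a side effect, overwrites the shared parameter.

import numpy
import theano
import theano.tensor as T

x = T.vector('x')
y = T.scalar('y')
w = theano.shared(numpy.zeros(3), name='w')   # hypothetical parameter vector

cost = (T.dot(x, w) - y) ** 2                 # squared error for one example
gw = T.grad(cost, w)

learning_rate = 0.1
updates = [(w, w - learning_rate * gw)]       # (shared variable, update expression)

train = theano.function([x, y], cost, updates=updates)

print(train(numpy.array([1.0, 2.0, 3.0]), 1.0))   # cost before the update
print(w.get_value())                              # w has been moved against the gradient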

      
