How does the basic Keras optimizer work?

Question

Here is part of the get_updates code from the SGD optimizer in Keras (source):

moments = [K.zeros(shape) for shape in shapes]
self.weights = [self.iterations] + moments
for p, g, m in zip(params, grads, moments):
    v = self.momentum * m - lr * g  # velocity
    self.updates.append(K.update(m, v))

Comment:

The variable moments is a list of zero tensors, so each m in the for loop is a zero tensor with the same shape as p. Then self.momentum * m, in the first line of the loop, is just a scalar multiplied by the zero tensor, which results in the zero tensor.

Question

What am I missing here? Thanks!

deep-learning machine-learning neural-network keras

oak, 4 Jul 2017 at 8:24

1 answer

Marcin Możejko · Accepted Answer · 2017-07-04T10:09:42+0000

Yes, during the first training iteration m is 0. But then it is updated with the current value of v in this line:

self.updates.append(K.update(m, v))

So in the next iteration, you will have:

v = self.momentum * old_velocity - lr * g  # velocity

where old_velocity is the previous value of v.
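To see the accumulation concretely, here is a minimal plain-Python sketch for a single scalar parameter. It is not Keras code: the function name sgd_momentum_steps, the constant gradient of 1.0, and the values lr=0.1 and momentum=0.9 are all made up for illustration, and a plain assignment stands in for K.update(m, v).

```python
# Plain-Python sketch of the momentum update for one scalar parameter.
# Hypothetical helper, not part of Keras.

def sgd_momentum_steps(grads, lr=0.1, momentum=0.9):
    """Return the velocity computed at each step, starting from m = 0."""
    m = 0.0  # plays the role of one entry of `moments` (created with K.zeros)
    velocities = []
    for g in grads:  # one pass per training iteration, not per parameter
        v = momentum * m - lr * g  # same formula as in get_updates
        m = v                      # stands in for K.update(m, v)
        velocities.append(v)
    return velocities


# Assumed constant gradient of 1.0 for three training iterations.
velocities = sgd_momentum_steps([1.0, 1.0, 1.0])
print(velocities)
```

On the first step the momentum term vanishes and v is just -lr * g. From the second step on, m holds the previous velocity, so the momentum term is no longer zero and v keeps growing in magnitude while the gradient keeps the same sign.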
