Sigmoid activation in neural networks with bias nodes

I am trying to figure out whether I am using bias correctly while building an artificial neural network with the sigmoid activation function. I want one bias node feeding into all hidden nodes with a constant output of -1, multiplied by its weight, and another bias node feeding into the output node, also with a constant output of -1 multiplied by its weight. Then I can train those bias weights the same way I train the other weights, right?
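To make the question concrete, here is a minimal sketch of what such a neuron would compute. The function names (`sigmoid`, `node_output`) are illustrative, not from any particular library; the bias node emits a constant -1 that is scaled by its own trainable weight, exactly as described above:

```python
import math

def sigmoid(z):
    """Standard logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))

def node_output(inputs, weights, bias_weight):
    """Output of one node: weighted sum of inputs plus the bias
    node's contribution (constant -1 times its trainable weight)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias_weight * -1.0
    return sigmoid(z)
```

Because `bias_weight` enters the weighted sum like any other weight, gradient descent updates it in exactly the same way as the ordinary input weights.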

Artificial Neural Network



1 answer

Your reasoning is correct, but it is quite rare to set the bias output to -1 (why not +1?); I have never seen -1 used in the literature. If you maintain the graph structure correctly, then there is no difference between the weight updates for "real" nodes and bias nodes. The only difference arises if you do not store the graph structure, and therefore do not "know" that the bias node (whose output feeds into other nodes) has no "children" of its own, so the error signal should not propagate back through it deeper into the network. I have seen code that simply stores each layer as an array, puts the bias at index 0, and iterates from index 1 during backpropagation. A graph-based implementation is much more readable (but much slower, since you cannot vectorize the calculations).
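The array layout described above can be sketched as follows. This is a hypothetical illustration, not code from any specific library: each layer's activation array reserves index 0 for the bias node's constant -1 output, and the backward pass starts its loop at index 1, so no delta is ever computed for the bias slot:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(layer_inputs, weights):
    """One layer forward. `layer_inputs` already has the bias value
    -1 at index 0; each row of `weights` covers all inputs, including
    the bias weight at column 0. The returned activation array again
    places the constant -1 bias output at index 0."""
    activations = [-1.0]  # bias slot at index 0
    for w_row in weights:  # one weight row per real node
        z = sum(w * a for w, a in zip(w_row, layer_inputs))
        activations.append(sigmoid(z))
    return activations

def backward(activations, deltas_next, weights_next):
    """Deltas for this layer's real nodes only. The loop starts at
    index 1, skipping the bias slot: the bias node has no children
    feeding it, so no error signal propagates back through it."""
    deltas = []
    for j in range(1, len(activations)):
        a = activations[j]
        # sum the error arriving from the next layer via column j
        err = sum(d * w_row[j] for d, w_row in zip(deltas_next, weights_next))
        deltas.append(a * (1.0 - a) * err)  # sigmoid derivative a(1-a)
    return deltas
```

With this layout the bias weights still get updated (they appear in column 0 of each weight row), but the "skip index 0" convention keeps the error from being propagated into a node that has no incoming connections.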


