Neural network: gradient descent only finds the average of the outputs?

This problem is more conceptual than code-related, so the fact that it's written in JS doesn't really matter.

So I'm trying to create a neural network, and I'm testing it by training it on a simple task: a logic gate (an OR gate, or really any simple gate). I use gradient descent without any batches for simplicity (batches seem unnecessary for this task, and less code is easier for me to debug).
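
Roughly, this is how I run training: one weight update per example, looping over the whole training set each epoch. (This is a simplified sketch; the network construction and the epoch count here are placeholders, not my exact code.)

var trainingSet = [
    [[0,0],[0]],
    [[0,1],[1]],
    [[1,0],[1]],
    [[1,1],[0]]
];

//One epoch = one pass over the whole training set,
//with a weight update after every single example (no batches)
for(var epoch = 0; epoch < 10000; epoch ++){
    for(var t = 0; t < trainingSet.length; t ++){
        network.backpropigate(trainingSet[t]);
    }
}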

However, after many iterations, the output always converges to the average of the target outputs. For example, given this training set:

[0,0] = 0
[0,1] = 1
[1,0] = 1
[1,1] = 0


The output, regardless of the input, always converges to around 0.5. If the training set is:

[0,0] = 0,
[0,1] = 1,
[1,0] = 1,
[1,1] = 1


The output always converges to around 0.75, the average of all the target outputs. The same thing happens for every combination of outputs I've tried.

This seems to be happening because whenever the network is given an example with a target of 0, it changes the weights to get closer to that, and whenever it's given an example with a target of 1, it changes the weights to get closer to that, which means that over time it converges on the average.
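
That would be consistent with the network effectively ignoring its inputs: a constant prediction c that minimizes the squared error over a training set is just the mean of the targets. As a quick sanity check (this is just the math, not from my code):

\[
\frac{d}{dc}\sum_i (c - y_i)^2 = 2\sum_i (c - y_i) = 0
\quad\Longrightarrow\quad
c = \frac{1}{n}\sum_i y_i
\]

For the targets {0, 1, 1, 0} that gives c = 0.5, and for {0, 1, 1, 1} it gives c = 0.75, exactly the values I'm seeing.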

Here's the backpropagation code (written in JavaScript):

this.backpropigate = function(data){
    //Sets the inputs
    for(var i = 0; i < this.layers[0].length; i ++){
        if(i < data[0].length){
            this.layers[0][i].output = data[0][i];
        }
        else{
            this.layers[0][i].output = 0;
        }
    }
    this.feedForward(); //Rerun through the NN with the new set outputs
    for(var i = this.layers.length-1; i >= 1; i --){
        for(var j = 0; j < this.layers[i].length; j ++){
            var ref = this.layers[i][j];
            //Calculate the gradients for each Neuron
            if(i == this.layers.length-1){ //Output layer neurons
                var error = ref.output - data[1][j]; //Error
                ref.gradient = error * ref.output * (1 - ref.output);
            }
            else{ //Hidden layer neurons
                var gradSum = 0; //Find sum from the next layer
                for(var m = 0; m < this.layers[i+1].length; m ++){
                    var ref2 = this.layers[i+1][m];
                    gradSum += (ref2.gradient * ref2.weights[j]);
                }
                ref.gradient = gradSum * ref.output * (1-ref.output);
            }
            //Update each of the weights based off of the gradient
            for(var m = 0; m < ref.weights.length; m ++){
                //Find the corresponding neuron in the previous layer
                var ref2 = this.layers[i-1][m];
                ref.weights[m] -= LEARNING_RATE*ref2.output*ref.gradient;
            }
        }
    }
    this.feedForward();
};
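
One thing I wasn't sure about: I update the weights of layer i immediately, before computing the gradients of layer i-1, so the hidden-layer gradients are calculated from already-updated weights. I don't know whether that matters here, but the alternative I considered looks like this (a sketch of the idea, not my actual code):

//First pass: compute ALL the gradients using the current (old) weights
for(var i = this.layers.length-1; i >= 1; i --){
    for(var j = 0; j < this.layers[i].length; j ++){
        var ref = this.layers[i][j];
        if(i == this.layers.length-1){ //Output layer neurons
            var error = ref.output - data[1][j];
            ref.gradient = error * ref.output * (1 - ref.output);
        }
        else{ //Hidden layer neurons
            var gradSum = 0;
            for(var m = 0; m < this.layers[i+1].length; m ++){
                var ref2 = this.layers[i+1][m];
                gradSum += (ref2.gradient * ref2.weights[j]);
            }
            ref.gradient = gradSum * ref.output * (1 - ref.output);
        }
    }
}
//Second pass: only now that every gradient is final, update the weights
for(var i = this.layers.length-1; i >= 1; i --){
    for(var j = 0; j < this.layers[i].length; j ++){
        var ref = this.layers[i][j];
        for(var m = 0; m < ref.weights.length; m ++){
            ref.weights[m] -= LEARNING_RATE*this.layers[i-1][m].output*ref.gradient;
        }
    }
}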


Here the NN is structured so that each neuron is an object with inputs, weights, and an output that is calculated from those inputs and weights. The neurons are stored in a 2D array of layers, where the x dimension is the layer (so the first layer is the inputs, the second the hidden layer, etc.) and the y dimension is the list of neuron objects inside that layer. The training data is passed in as [data, correct-output], so for example [[0,1],[1]].
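
To illustrate the structure (this is just a sketch of the shape for a 2-2-1 network; the weight values are made up):

this.layers = [
    //Input layer: outputs are set directly from the data
    [ {output: 0}, {output: 0} ],
    //Hidden layer: 2 neurons, one weight per input neuron
    [ {weights: [0.3, -0.1], output: 0, gradient: 0},
      {weights: [0.7,  0.5], output: 0, gradient: 0} ],
    //Output layer: 1 neuron, one weight per hidden neuron
    [ {weights: [0.2, -0.4], output: 0, gradient: 0} ]
];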

Also, my LEARNING_RATE is 1 and my hidden layer has 2 neurons.

I feel like there are some conceptual issues in my backpropagation method, as I've tested the other parts of my code (like the feedForward part) and they work fine. I tried to use a variety of sources, although I mainly relied on the Wikipedia article on backpropagation and the equations it gives.
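
Specifically, these are the equations I tried to implement (for sigmoid activations; writing them out here in case I'm misreading them):

\[
\delta_j = (o_j - t_j)\,o_j(1 - o_j) \quad \text{(output layer)}
\]
\[
\delta_j = \Big(\sum_k \delta_k w_{jk}\Big)\,o_j(1 - o_j) \quad \text{(hidden layers)}
\]
\[
\Delta w_{ij} = -\eta\,o_i\,\delta_j
\]

The first two correspond to the ref.gradient lines in my code, and the last one to the weight update, with LEARNING_RATE as η.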


I know my code can be confusing to read, although I've tried to keep it as simple as possible. Any help would be greatly appreciated.
