TensorFlow MNIST tutorial - test accuracy is very low
I am starting with TensorFlow and following the standard MNIST tutorial.
However, in contrast to the expected 92% accuracy, the accuracy on both the training set and the test set does not exceed 67%. I am familiar with softmax and multinomial regression, and I got over 94% accuracy both with my own Python implementation and with sklearn.linear_model.LogisticRegression.
I tried the same thing with the CIFAR-10 dataset, where the accuracy was only about 10%, which is equivalent to assigning classes at random. This made me question my TensorFlow setup, but I'm not sure what is wrong.
Here is my implementation of the TensorFlow MNIST tutorial. Could someone have a look at it?
You built your graph, specified a loss function, and created an optimizer (which is correct). The problem is that you run your optimizer only once:
sess_tf.run(train_step, feed_dict={x: train_images_reshaped[0:1000], y_: train_labels[0:1000]})
So basically you run gradient descent only once. Clearly you cannot converge after just one tiny step in the right direction. You need to do something along these lines:
for _ in xrange(many_steps):
    X, Y = get_a_new_batch_from(mnist_data)
    sess_tf.run(train_step, feed_dict={x: X, y_: Y})
If you can't figure out how to adapt my pseudocode, refer back to the tutorial; as far as I remember, it covers this nicely.
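To see why one update isn't enough, here is a self-contained NumPy sketch of the same idea (the data is synthetic and all names are my own, purely for illustration): a single gradient step leaves softmax regression far from converged, while a loop of mini-batch steps trains it properly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for MNIST: 600 samples, 20 features, 3 classes,
# with labels generated by a linear rule so softmax regression can fit them.
n, d, k = 600, 20, 3
true_W = rng.normal(size=(d, k))
X_data = rng.normal(size=(n, d))
labels = (X_data @ true_W).argmax(axis=1)
Y_data = np.eye(k)[labels]                        # one-hot targets

def accuracy(W, b):
    return float(((X_data @ W + b).argmax(axis=1) == labels).mean())

def sgd_step(W, b, X, Y, lr=0.5):
    # One gradient-descent step on softmax cross-entropy for a mini-batch.
    logits = X @ W + b
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    g = (p - Y) / len(X)                          # d(loss)/d(logits)
    return W - lr * (X.T @ g), b - lr * g.sum(axis=0)

# A single step, which is what the question's code effectively does:
W1, b1 = sgd_step(np.zeros((d, k)), np.zeros(k), X_data[:100], Y_data[:100])
acc_one = accuracy(W1, b1)

# Many mini-batch steps, which is what the tutorial's training loop does:
W, b = np.zeros((d, k)), np.zeros(k)
for _ in range(500):
    idx = rng.integers(0, n, size=100)
    W, b = sgd_step(W, b, X_data[idx], Y_data[idx])
acc_many = accuracy(W, b)

print(acc_one, acc_many)
```

The point is not the exact numbers but the gap: repeated mini-batch updates are what drive the accuracy up, exactly as the tutorial's training loop does.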
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
Initializing W to all zeros can cause your network to learn nothing better than random guessing: the gradients will be zero and backprop won't really work at all.
Better to initialize W with:
tf.Variable(tf.truncated_normal([784, 10], mean=0.0, stddev=0.01))
For more details, see https://www.tensorflow.org/api_docs/python/tf/truncated_normal.
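To make concrete what a truncated normal draw looks like, here is an illustrative NumPy approximation of tf.truncated_normal (the function name is my own; like TF's version, it redraws any sample that lands more than two standard deviations from the mean):

```python
import numpy as np

def truncated_normal(shape, mean=0.0, stddev=0.01, rng=None):
    # Illustrative NumPy analogue of tf.truncated_normal: sample a normal
    # distribution and redraw any value more than two standard deviations
    # from the mean, so all entries end up in [mean - 2*stddev, mean + 2*stddev].
    rng = np.random.default_rng(0) if rng is None else rng
    w = rng.normal(mean, stddev, size=shape)
    bad = np.abs(w - mean) > 2 * stddev
    while bad.any():
        w[bad] = rng.normal(mean, stddev, size=int(bad.sum()))
        bad = np.abs(w - mean) > 2 * stddev
    return w

# Small random values instead of zeros for the weight matrix:
W0 = truncated_normal([784, 10])
print(W0.shape, np.abs(W0).max())
```

The bias b can stay at zeros; it is the weight matrix whose symmetry you want to break with small random values.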