Why do we flatten the data before feeding it into TensorFlow?

I am following the Udacity MNIST tutorial, where each MNIST image is originally a 28*28 matrix. However, before feeding this data in, they flatten each image into a 1-D array of 784 elements (784 = 28 * 28).

For example, the original training set had shape (200000, 28, 28): 200,000 rows, each one a 28 * 28 matrix.

They turned it into a training set whose shape is (200000, 784).

Can someone explain why they flatten the data before feeding it to TensorFlow?
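For reference, the reshape itself is a one-liner. A minimal NumPy sketch (using a placeholder array of 1,000 zero images instead of the tutorial's full 200,000, to keep it small):

```python
import numpy as np

# Placeholder batch of 1000 "images", each a 28x28 matrix
# (the tutorial's real training set has 200000 of them).
train = np.zeros((1000, 28, 28), dtype=np.float32)

# Flatten each 28x28 image into a 784-element row vector.
flat = train.reshape(-1, 28 * 28)

print(train.shape)  # (1000, 28, 28)
print(flat.shape)   # (1000, 784)
```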

+3




2 answers


Because when you add a fully connected layer, you want your data to be a (1- or) 2-dimensional matrix, where each row is the vector representing one example. The fully connected layer is just a matrix multiplication between your input (shape (batch_size, n_features)) and the weights (shape (n_features, n_outputs)), plus a bias and an activation function, and you get an output of shape (batch_size, n_outputs). Also, a fully connected layer doesn't use the original spatial structure, so it costs nothing to lose it.
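A minimal NumPy sketch of that matrix multiplication (the shapes are the ones above; the input and weight values are random placeholders, and ReLU stands in for the activation function):

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, n_features, n_outputs = 32, 784, 10

x = rng.standard_normal((batch_size, n_features))   # flattened input rows
W = rng.standard_normal((n_features, n_outputs))    # weight matrix
b = np.zeros(n_outputs)                             # bias

logits = x @ W + b             # (batch_size, n_outputs)
out = np.maximum(logits, 0.0)  # activation (ReLU, as an example)

print(out.shape)  # (32, 10)
```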



It would be harder and less efficient to get the same result without reshaping first, which is why we always do this in front of a fully connected layer. For a convolutional layer, by contrast, you want to keep the data in its original (width, height) format.

+4




This is a convention for fully connected layers. A fully connected layer connects every node in the previous layer to every node in the next layer, so locality is not an issue for this type of layer.



In addition, with the layer defined this way, we can efficiently compute the next step with the formula f(Wx + b) = y. This would not be as straightforward with multi-dimensional input, and reshaping the input is inexpensive and easy to do.
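A sketch of f(Wx + b) on flattened input, assuming ReLU for f and random placeholder weights. Note that for a contiguous array, NumPy's reshape returns a view rather than copying the data, which is why the flattening step is cheap:

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.standard_normal((8, 28, 28))  # small batch of fake images
W = rng.standard_normal((784, 10))
b = np.zeros(10)

x = images.reshape(8, -1)     # (8, 784); a view, no data copied
y = np.maximum(x @ W + b, 0)  # f(Wx + b) with f = ReLU

print(y.shape)  # (8, 10)
```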

+2



