Can I use layer normalization for a CNN?

I see that layer normalization is a modern normalization technique (as opposed to standard batch normalization), and it is very easy to code in TensorFlow. But I thought layer normalization was for RNNs and batch normalization for CNNs. Can I use layer normalization for an image-classification CNN? What are the criteria for choosing batch vs. layer normalization?



2 answers


You can use Layer norm in a CNN, but I don't think it's more "modern" than Batch norm. They simply normalize in different ways: Layer norm normalizes all the activations of a single sample within a layer, collecting statistics across the units of that layer, while Batch norm normalizes each individual activation across the whole batch, collecting statistics for that activation over every item in the batch.

Batch norm is generally preferred over Layer norm, because it tries to normalize each activation to a unit Gaussian distribution, whereas Layer norm only "averages" all the activations of a layer toward a unit Gaussian. But if the batch size is too small to collect reasonable statistics, Layer norm is preferable.
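To make the difference concrete, here is a minimal NumPy sketch (my own illustration, not code from either paper) of the two axis conventions on a 4D CNN activation tensor laid out as (batch, height, width, channels):

```python
import numpy as np

# Toy activations for a conv layer: (batch, height, width, channels)
x = np.random.randn(8, 4, 4, 3)
eps = 1e-5

# Batch norm: statistics per channel, collected across the whole batch
bn_mean = x.mean(axis=(0, 1, 2), keepdims=True)   # shape (1, 1, 1, 3)
bn_var = x.var(axis=(0, 1, 2), keepdims=True)
x_bn = (x - bn_mean) / np.sqrt(bn_var + eps)

# Layer norm: statistics per sample, collected across all units of the layer
ln_mean = x.mean(axis=(1, 2, 3), keepdims=True)   # shape (8, 1, 1, 1)
ln_var = x.var(axis=(1, 2, 3), keepdims=True)
x_ln = (x - ln_mean) / np.sqrt(ln_var + eps)

# After batch norm, each channel has ~zero mean over the batch;
# after layer norm, each sample has ~zero mean over its own activations.
print(np.allclose(x_bn.mean(axis=(0, 1, 2)), 0, atol=1e-6))  # True
print(np.allclose(x_ln.mean(axis=(1, 2, 3)), 0, atol=1e-6))  # True
```

Note the scale/shift parameters (gamma, beta) that both techniques learn are omitted here; this shows only where the statistics come from.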



I would also like to add that, as stated in the original Layer Norm paper (page 10, section 6.7), Layer Norm is not recommended for CNNs; the authors say more research is needed there.

Also, as a heads-up: for RNNs, Layer Norm seems to be a better choice than Batch Norm, because training cases can have different lengths within the same mini-batch.
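A small NumPy sketch (my own illustration, not from the paper) of why per-sample statistics suit variable-length batches: the layer-norm output for one sequence's hidden state does not change when other, differently-sized sequences join the batch, because no cross-batch statistics are used.

```python
import numpy as np

def layer_norm(h, eps=1e-5):
    # Normalize each row (one time step of one sequence) over its own features.
    mu = h.mean(axis=-1, keepdims=True)
    var = h.var(axis=-1, keepdims=True)
    return (h - mu) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
step = rng.normal(size=(1, 16))   # hidden state of one sequence at time t

# Layer norm of this sequence is identical whether it is alone in the batch
# or stacked with five other (possibly padded) sequences.
alone = layer_norm(step)
with_others = layer_norm(np.vstack([step, rng.normal(size=(5, 16))]))[0:1]
print(np.allclose(alone, with_others))  # True

# Batch norm, by contrast, would compute mean/var over axis 0 (the batch),
# so its output for `step` depends on what else happens to be in the batch.
```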







