How does the shuffle='batch' argument of .fit() work in the background?
When I train a model with .fit(), the shuffle parameter is set to True by default.
Let's say my dataset has 100 samples and the batch size is 10. When I set shuffle=True, Keras first shuffles the samples randomly (so the 100 samples are now in a different order) and then builds the batches from that new order: batch 1: positions 1-10, batch 2: positions 11-20, etc.
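To make this concrete, here is a minimal NumPy sketch of what I assume happens with shuffle=True. It is my own illustration, not Keras's code, and the names n_samples, batch_size, index_array and batches are made up for the example:

```python
import numpy as np

n_samples, batch_size = 100, 10

# Assumed behaviour of shuffle=True: permute all sample indices once
# (indices are 0-based here, so "samples 1-10" are indices 0-9)...
rng = np.random.default_rng(0)
index_array = rng.permutation(n_samples)

# ...then cut the permuted indices into consecutive batches.
batches = [index_array[start:start + batch_size]
           for start in range(0, n_samples, batch_size)]

print(len(batches))  # number of batches
print(batches[0])    # ten indices drawn from anywhere in the dataset
```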
If I set shuffle='batch', how is it supposed to work in the background? Intuitively, and using the previous example of 100 samples with batch size 10, I am assuming Keras first assigns the samples to batches (i.e. batch 1: samples 1-10 in the original order of the dataset, batch 2: samples 11-20 in the original order of the dataset, batch 3: ... and so on) and then shuffles the order of the batches. So the model will now be trained on the batches in random order, e.g.: 3 (contains samples 21-30), 4 (contains samples 31-40), 7 (contains samples 61-70), 1 (contains samples 1-10), ... (I made up the order of the batches).
I think that's correct, or am I missing something?
Thanks!
Looking at the implementation at this link (line 349 of training.py), the answer seems to be yes.
Try this code to check:
import numpy as np


def batch_shuffle(index_array, batch_size):
    """Shuffles an array in a batch-wise fashion.

    Useful for shuffling HDF5 arrays
    (where one cannot access arbitrary indices).

    # Arguments
        index_array: array of indices to be shuffled.
        batch_size: integer.

    # Returns
        The `index_array` array, shuffled in a batch-wise fashion.
    """
    batch_count = int(len(index_array) / batch_size)
    # to reshape we need to be cleanly divisible by batch size
    # we stash extra items and reappend them after shuffling
    last_batch = index_array[batch_count * batch_size:]
    index_array = index_array[:batch_count * batch_size]
    index_array = index_array.reshape((batch_count, batch_size))
    np.random.shuffle(index_array)
    index_array = index_array.flatten()
    return np.append(index_array, last_batch)
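
x = np.arange(100)
x_s = batch_shuffle(x, 10)
# Each row is one batch: the rows come out in random order,
# but every row is still a run of ten consecutive indices.
print(x_s.reshape(10, 10))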
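
To confirm the result programmatically rather than by eye, a small check like the following may help. It is only a sketch: is_batchwise_shuffle is a hypothetical helper (not part of Keras), and the two test arrays are built directly with NumPy so that the snippet runs on its own.

```python
import numpy as np


def is_batchwise_shuffle(shuffled, batch_size):
    """Return True if every full batch is a run of consecutive indices
    that starts at a multiple of batch_size (hypothetical helper)."""
    shuffled = np.asarray(shuffled)
    n_full = len(shuffled) // batch_size
    batches = shuffled[:n_full * batch_size].reshape(n_full, batch_size)
    starts_aligned = np.all(batches[:, 0] % batch_size == 0)
    consecutive = np.all(np.diff(batches, axis=1) == 1)
    return bool(starts_aligned and consecutive)


rng = np.random.default_rng(0)

# Batch-wise shuffle: permute the order of the 10 batches, keep their contents.
batch_order = rng.permutation(10)
batch_shuffled = (batch_order[:, None] * 10 + np.arange(10)).ravel()

# Full shuffle: permute all 100 indices individually.
fully_shuffled = rng.permutation(100)

print(is_batchwise_shuffle(batch_shuffled, 10))  # True
print(is_batchwise_shuffle(fully_shuffled, 10))
```

Passing the x_s array from the code above to is_batchwise_shuffle(x_s, 10) should likewise return True if shuffle='batch' only reorders whole batches.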