CNTK: Create MinibatchSource from numpy array for multi-GPU training

I have preprocessed image data in a numpy array, and my script works fine on a single GPU when I feed the numpy array directly. From what I understand, we need to create a MinibatchSource for multi-GPU training. I am looking at the ConvNet_CIFAR10_DataAug_Distributed.py example for distributed learning, but it uses *_map.txt files, which are basically lists of paths to image files (like PNGs). I am wondering what the best way is to create a MinibatchSource from a numpy array instead of converting the numpy array back to PNG files.
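
For context, the single-GPU version described above looks roughly like this (a minimal sketch; the random arrays, shapes, and model are hypothetical stand-ins for the real preprocessed data):

import numpy as np
import cntk as C

# hypothetical preprocessed data: 100 RGB 224x224 images and matching targets
features = np.random.rand(100, 3, 224, 224).astype(np.float32)
targets = np.random.rand(100, 3, 224, 224).astype(np.float32)

x = C.input_variable((3, 224, 224))
y = C.input_variable((3, 224, 224))
z = C.layers.Convolution((3, 3), 3, pad=True)(x)
loss = C.squared_error(z, y)
trainer = C.Trainer(z, loss, C.sgd(z.parameters, C.learning_rate_schedule(0.00001, C.UnitType.minibatch)))

# feed numpy slices straight into train_minibatch; fine on a single GPU
for i in range(0, 100, 2):
    trainer.train_minibatch({x: features[i:i+2], y: targets[i:i+2]})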



1 answer


You can create a composite reader that combines multiple image deserializers into a single source. First you need to create two map files (with dummy labels): one listing all the input images, the other listing the corresponding target images. The following code is a minimal implementation, assuming the files are named map1.txt and map2.txt.
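
Each line of an ImageDeserializer map file is a tab-separated image path and label; since both streams carry images here, the label column is just a dummy. A minimal sketch of writing the two files, assuming hypothetical lists of input and target image paths:

# hypothetical lists of paths to the input and target images on disk
input_paths = ["images/input_0.png", "images/input_1.png"]
target_paths = ["images/target_0.png", "images/target_1.png"]

# map file format: <path>\t<label>; the label 0 is a dummy
with open("map1.txt", "w") as f:
    f.writelines("{}\t0\n".format(p) for p in input_paths)
with open("map2.txt", "w") as f:
    f.writelines("{}\t0\n".format(p) for p in target_paths)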



import numpy as np
import cntk as C
import cntk.io.transforms as xforms 
import sys

def create_reader(map_file1, map_file2):
    # scale every image to 3 x 224 x 224 on the fly
    transforms = [xforms.scale(width=224, height=224, channels=3, interpolations='linear')]
    # one deserializer per map file; each exposes a single image stream
    source1 = C.io.ImageDeserializer(map_file1, C.io.StreamDefs(
        source_image = C.io.StreamDef(field='image', transforms=transforms)))
    source2 = C.io.ImageDeserializer(map_file2, C.io.StreamDefs(
        target_image = C.io.StreamDef(field='image', transforms=transforms)))
    # combine both deserializers into one randomized minibatch source
    return C.io.MinibatchSource([source1, source2], max_samples=sys.maxsize, randomize=True)

x = C.input_variable((3,224,224))
y = C.input_variable((3,224,224))
# world's simplest model: a single padded 3x3 convolution
model = C.layers.Convolution((3,3), 3, pad=True)
z = model(x)
loss = C.squared_error(z, y)

reader = create_reader("map1.txt", "map2.txt")
trainer = C.Trainer(z, loss, C.sgd(z.parameters, C.learning_rate_schedule(.00001, C.UnitType.minibatch)))

minibatch_size = 2

# map the reader's named streams to the network inputs
input_map = {
    x: reader.streams.source_image,
    y: reader.streams.target_image
}

for i in range(30):
    data = reader.next_minibatch(minibatch_size, input_map=input_map)
    print(data)  # inspect what the reader delivers
    trainer.train_minibatch(data)
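
With the data coming through a MinibatchSource, extending this to multiple GPUs follows the same pattern as ConvNet_CIFAR10_DataAug_Distributed.py: wrap the learner in a distributed learner and give each worker its own partition of each minibatch. A minimal sketch, reusing z, loss, reader, and input_map from above (launch with MPI, e.g. mpiexec -n 2 python script.py):

# wrap the local SGD learner so gradients are aggregated across workers
local_learner = C.sgd(z.parameters, C.learning_rate_schedule(.00001, C.UnitType.minibatch))
learner = C.train.distributed.data_parallel_distributed_learner(local_learner)
trainer = C.Trainer(z, loss, learner)

for i in range(30):
    # each MPI worker reads a distinct partition of every minibatch
    data = reader.next_minibatch(minibatch_size, input_map=input_map,
                                 num_data_partitions=C.train.distributed.Communicator.num_workers(),
                                 partition_index=C.train.distributed.Communicator.rank())
    trainer.train_minibatch(data)

# shut down MPI cleanly at the end of a distributed run
C.train.distributed.Communicator.finalize()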

      
