Converting a Caffe Model to CoreML

I am working on understanding CoreML. As a starter model, I downloaded Yahoo's Open NSFW caffemodel. You give it an image, and it gives you an estimate of the probability (0 to 1) that the image contains inappropriate content.

Using coremltools, I converted the model to .mlmodel and brought it into my application. It appears in Xcode like this:

*(screenshot: the imported .mlmodel as shown in Xcode)*
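For reference, the conversion itself was along these lines (the file names are the ones from the open_nsfw download; note that no mean subtraction or channel-swap options are specified here):

import coremltools

# File names assumed from the yahoo/open_nsfw repository; adjust to your paths.
# 'data' is the input blob name; declaring it as an image input makes the
# converted model take a CVPixelBuffer rather than an MLMultiArray.
coreml_model = coremltools.converters.caffe.convert(
    ('resnet_50_1by2_nsfw.caffemodel', 'deploy.prototxt'),
    image_input_names='data',
)
coreml_model.save('Nsfw.mlmodel')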

In my application, I can pass the image in successfully, and the output comes back as an MLMultiArray. Where I'm having trouble is understanding how to use this MLMultiArray to get the likelihood estimate. My code looks like this:

func testModel(image: CVPixelBuffer) throws {

    let model = myModel()
    let prediction = try model.prediction(data: image)
    let output = prediction.prob // MLMultiArray
    print(output[0]) // 0.9992402791976929
    print(output[1]) // 0.0007597212097607553
}

For reference, the CVPixelBuffer is resized to the 224x224 input the model asks for (I'll experiment with Vision for this as soon as I can figure it out).

The two values I print to the console change when I provide a different image, but they are very different from the result I get when I run the model in Python. The same image passed to the model in Python gives me 0.16, whereas the CoreML output in the example above is nothing like what I expect (and is an MLMultiArray, as opposed to the single double that Python returns).

Does it take more work to get the result I expect?

1 answer


It seems that you are not transforming the input image the way the model expects.
Most Caffe models expect "mean subtracted" images as input, and this model is no exception. If you check the Python code that ships with Yahoo Open NSFW (classify_nsfw.py):

# Note that the parameters are hard-coded for best results
caffe_transformer = caffe.io.Transformer({'data': nsfw_net.blobs['data'].data.shape})
caffe_transformer.set_transpose('data', (2, 0, 1))  # move image channels to outermost
caffe_transformer.set_mean('data', np.array([104, 117, 123]))  # subtract the dataset-mean value in each channel
caffe_transformer.set_raw_scale('data', 255)  # rescale from [0, 1] to [0, 255]
caffe_transformer.set_channel_swap('data', (2, 1, 0))  # swap channels from RGB to BGR

Images are also resized in a specific way: first to 256x256, then center-cropped to 224x224, roughly as in the sketch below.
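In Python, the whole preprocessing pipeline amounts to something like this (a minimal sketch with PIL and NumPy, not the repository's exact code):

import numpy as np
from PIL import Image

def preprocess(path):
    # Resize to 256x256 first, then take the central 224x224 crop
    img = Image.open(path).convert('RGB').resize((256, 256), Image.BILINEAR)
    arr = np.asarray(img, dtype=np.float32)
    off = (256 - 224) // 2
    arr = arr[off:off + 224, off:off + 224, :]
    arr = arr[:, :, ::-1]                       # swap channels RGB -> BGR
    arr -= np.array([104.0, 117.0, 123.0])      # subtract the BGR channel means
    return arr.transpose(2, 0, 1)               # HWC -> CHW, as Caffe expects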



To get exactly the same results, you will need to transform the input image in exactly the same way on both platforms.
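Alternatively, coremltools can bake the mean subtraction and the RGB-to-BGR swap into the .mlmodel at conversion time, so only the resize and crop have to happen on the device. A sketch, again assuming the open_nsfw file names:

import coremltools

# File names assumed from the yahoo/open_nsfw repository.
# The biases are the negated per-channel means from classify_nsfw.py
# (BGR means 104, 117, 123), and is_bgr=True feeds the network BGR pixels.
model = coremltools.converters.caffe.convert(
    ('resnet_50_1by2_nsfw.caffemodel', 'deploy.prototxt'),
    image_input_names='data',
    is_bgr=True,
    red_bias=-123.0,
    green_bias=-117.0,
    blue_bias=-104.0,
)
model.save('Nsfw.mlmodel')

With the preprocessing baked in, the generated Swift class still takes a CVPixelBuffer, but you only need to hand it the 256-resized, 224-center-cropped image.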

See this thread for more information.
