Invalid value after setting (scalar) OpenCL kernel argument

I am working on an OpenCL program, but the result differs from one execution to the next. I think it has to do with passing arguments to the kernel, because when I hardcode the values in the kernel, the result is the same after every execution.

My kernel looks like this:

__kernel void sample_kernel(__global double *BufferA, int scalar1, int scalar2, int scalar3, ...) {

    for(int i = -1*scalar1; i < scalar1; i++) {
        for(int j = -1*scalar1; j < scalar1; j++) {
            if(scalar2 > 0 && scalar3 > 0) // do something.
        }
    }
}

      

And this is how I set the kernel arguments:

int scalar1 = 1;
int scalar2 = 2;
int scalar3 = 3;

Samplekernel.setArg(0, d_BufferA);
Samplekernel.setArg(1, sizeof(int), &scalar1);
Samplekernel.setArg(2, sizeof(int), &scalar2);
Samplekernel.setArg(3, sizeof(int), &scalar3);

      

The weird thing is that when I add ...

if(scalar1 != 1) scalar1 = 1;
if(scalar2 != 2) scalar2 = 2;
if(scalar3 != 3) scalar3 = 3;

      

... in the kernel before the double for loop, the result is correct.

I am running my program on an Nvidia K20m GPU with OpenCL 1.1. When I run the same code on an Nvidia C2075, everything works fine ...

Does anyone have any idea what the problem might be? It looks as if the values were not copied correctly or were overwritten, but I do not access them until the for loops ...

Thanks in advance!


1 answer


It looks like you are passing a pointer to int in setArg:

Samplekernel.setArg(1, sizeof(int), &scalar1);

      

and then, in your kernel parameter list, you have int values, not pointers:

__kernel void sample_kernel(__global double *BufferA, int scalar1, ...

      

You can use pointers in the kernel parameter list, for example:

__kernel void sample_kernel(__global double *BufferA, global int *scalar1,

      



Or, and this is what I would suggest because I could not find your version of Kernel::setArg(...) in the C++ bindings specification (only, for some reason, in the implementation at khronos.org), copy the scalar directly like so:

Samplekernel.setArg(1, scalar1);

      

This also has the advantage that the variable lives in the kernel's private memory space rather than in global memory, as it would if you passed a buffer as the argument.

The Kernel::setArg version you are using may not copy the value and may only be meant for linked kernels, but I'm not sure about that.

Alternatively, you can check the return value of setArg for errors.
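
A minimal sketch of such a check, assuming exceptions are not enabled in the C++ bindings, so that setArg returns an integer status with 0 (CL_SUCCESS) meaning success. The helper set_arg_checked and the FakeKernel type are hypothetical names introduced here; FakeKernel is only a stand-in so the snippet builds without an OpenCL SDK, and is not part of any OpenCL API.

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical helper: sets one by-value argument and reports a failure.
// Intended to work with any kernel type whose setArg(index, value) returns
// an integer status where 0 means success (CL_SUCCESS is 0 in OpenCL).
template <typename KernelT, typename T>
bool set_arg_checked(KernelT& kernel, unsigned index, const T& value) {
    const int err = kernel.setArg(index, value);
    if (err != 0) {
        std::fprintf(stderr, "setArg(%u) failed with error %d\n", index, err);
        return false;
    }
    return true;
}

// Hypothetical stand-in for a kernel object (NOT the real cl::Kernel).
// It models four int-sized argument slots and nothing else.
struct FakeKernel {
    int args[4] = {0, 0, 0, 0};

    template <typename T>
    int setArg(unsigned index, const T& value) {
        if (index >= 4) return -49;                // same number as CL_INVALID_ARG_INDEX
        if (sizeof(T) != sizeof(int)) return -51;  // same number as CL_INVALID_ARG_SIZE
        std::memcpy(&args[index], &value, sizeof(int));  // value is copied at call time
        return 0;
    }
};
```

With the real bindings, the same helper could be called as set_arg_checked(Samplekernel, 1, scalar1) for each scalar; a non-zero code such as CL_INVALID_ARG_SIZE would point to a mismatch between the host type and the kernel parameter.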
