What is causing the division error in this CUDA kernel?

I recently noticed an error when launching our program on different hardware. I was able to track things down to the initial part of the kernel where the position within the grid was calculated using modulo (%) and the division operator (/). This is a minimal working example that reproduces the error:

#include <stdio.h>

__global__ void div_issue( int blocks_x, int* block_offset)
{
    int blks_x = blocks_x;


    //linear index of the block within the 2D grid
    int block_id_2d = block_offset[0]; 

    //x-coordinate of block in absolute grid
    int block_idx = block_id_2d % blks_x;

    //y-coordinate of block in absolute grid    
    int block_idy = (block_id_2d - block_idx) / blks_x;

    printf("%d mod %d = %d \n", block_id_2d, blks_x, block_idx);
    printf("%d / %d = %d \n", block_id_2d - block_idx, blks_x, block_idy);

}    

int main(int argc, char *argv[])
{
    int dev_count;
    cudaGetDeviceCount(&dev_count);
    for (int i = 0; i < dev_count; i++)
    {
        cudaSetDevice(i);
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("using device %s :\n\n", prop.name);

        int block_offset_host[1];
        block_offset_host[0] = 753;
        int* block_offset_dev;
        cudaMalloc(&block_offset_dev, sizeof(int));
        cudaMemcpy(block_offset_dev, block_offset_host, sizeof(int), cudaMemcpyHostToDevice);

        div_issue <<<1, 1 >>>( 251 , block_offset_dev);

        cudaDeviceSynchronize();

        //release the device buffer before moving on to the next GPU
        cudaFree(block_offset_dev);

        printf("\n");

    }
}

      

Result (on my machine with 2 GPUs):

using device GeForce GTX 980 Ti :
753 mod 251 = 0 
753 / 251 = 4 
using device GeForce GTX TITAN Black :
753 mod 251 = 0 
753 / 251 = 3 
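
For reference, the same index arithmetic can be mirrored on the host to see which of the two outputs is correct. The following is only a minimal sketch: `BlockCoord`, `split_block_id`, and `is_consistent` are hypothetical names introduced here for illustration and are not part of the original program. It repeats the modulo and division steps from `div_issue` and checks that the coordinates recombine to the original linear index.

```cpp
#include <cassert>

// Hypothetical helper type (not from the original post): the 2D position
// of a block in the absolute grid.
struct BlockCoord {
    int x;  // column of the block
    int y;  // row of the block
};

// Host-side mirror of the arithmetic in div_issue: split a linear block
// index into x and y using modulo and integer division.
BlockCoord split_block_id(int block_id_2d, int blks_x)
{
    assert(blks_x > 0);  // a grid needs at least one block per row
    BlockCoord c;
    c.x = block_id_2d % blks_x;
    c.y = (block_id_2d - c.x) / blks_x;
    return c;
}

// True when the coordinates recombine to the original linear index,
// i.e. y * blks_x + x == block_id_2d.
bool is_consistent(int block_id_2d, int blks_x, BlockCoord c)
{
    return c.y * blks_x + c.x == block_id_2d;
}
```

With the values from the example, `split_block_id(753, 251)` gives x = 0 and y = 3, since 3 * 251 = 753. A result of y = 4 fails the `is_consistent` check, so the output printed for the GTX 980 Ti is the incorrect one.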

      

I am using CUDA 7.0, Visual Studio 2012, and driver 9.18.13.5306 WHQL on Windows 8.

I have no other indication that the 980 Ti has a hardware defect. Can anyone confirm the issue on their own hardware?

So far, this doesn't seem to happen in debug mode. No additional compiler options such as -use_fast_math, -prec-div=false, or -prec-sqrt=false were used.


2 answers


After I submitted a bug report, NVIDIA confirmed that this is a compiler bug they already knew about. It is supposed to be fixed in CUDA 7.5 (final, not RC).





I had the same error with the CUDA 7.5 RC compiler (CUDA 7.5.7 RC, Ubuntu 14.04 Linux, Titan X), but it seems to be fixed in CUDA 7.5.18 (final release).


