Dynamic Parallelism on GTX 980 Ti: Unknown error

I am trying to use dynamic parallelism on a GTX 980 Ti card. Every attempt to run the code returns an "unknown error". A minimal example is shown below, along with the compilation options.

I can execute kernels at depth 0 with no problem. The first time the child kernel is launched, an error is thrown. The cudaDeviceSynchronize() calls were added after looking at other questions here, but they didn't fix the issue.

Any ideas? Could this be a driver issue?

Edit 1:

OS: Linux-x86_64

Nvidia driver version: 384.59

nvcc version 7.5.17

There are two 980 Ti cards connected via PCIe x16 Gen3. The system also has Windows installed on another SSD with a RAID array.

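#include <cuda.h>
#include <fstream>
#include <stdio.h>
#include <stdlib.h>

// Child kernel (depth 1): does trivial work.
__global__ void ker_two() {
    int two = 0;
    two++;
}

// Parent kernel (depth 0): launches the child from the device.
__global__ void ker_one() {
    int one = 0;
    one++;
    ker_two<<<1, 1>>>();
    cudaDeviceSynchronize();  // device-side wait for the child
}

int main() {
    ker_one<<<1, 1>>>();
    cudaDeviceSynchronize();  // host-side wait for the parent

    // Report the last error recorded by the runtime.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess)
        printf("Cuda Error: %s\n", cudaGetErrorString(err));

    return 0;
}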

      

Compiled with:

nvcc -arch=compute_52 -rdc=true -lcudadevrt test.cu

      



1 answer


I can (?) reproduce the error on a machine with a Maxwell Titan card. It runs a Fedora 24 distribution with CUDA 8.0.61 installed manually. The driver version is 375.51.

However, on my system the problem only seems to occur when I call cudaDeviceSynchronize() inside ker_one(), whether or not I launch the second kernel. So perhaps that call is the problem you are seeing, rather than dynamic parallelism per se.
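One way to test this is to drop the device-side cudaDeviceSynchronize() and check the host-side calls separately. The following is only a minimal sketch, not a confirmed fix: the helper name run_parent_child is made up for illustration, and the kernels are stripped-down versions of the ones in the question. It relies on the fact that a parent grid is not considered complete until its child grids have finished, so the host-side cudaDeviceSynchronize() also waits for ker_two.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void ker_two() {}

__global__ void ker_one() {
    // Child launch only; no device-side cudaDeviceSynchronize() here.
    ker_two<<<1, 1>>>();
}

// Hypothetical helper (not from the question): runs the parent kernel and
// returns the first error seen, printing where it was detected.
cudaError_t run_parent_child() {
    ker_one<<<1, 1>>>();

    // Errors detected at launch time (bad configuration, missing device code, ...).
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("Launch error: %s\n", cudaGetErrorString(err));
        return err;
    }

    // Errors raised while the parent grid or its child grid was running.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess)
        printf("Execution error: %s\n", cudaGetErrorString(err));
    return err;
}
```

Call run_parent_child() from main() and build with the same -rdc=true -lcudadevrt options as in the question. If this version runs cleanly while the original fails, that would point at the device-side synchronization rather than the child launch itself.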



Considering @talonmies' comment, it might even be a driver issue.
