Profiling cuBLAS applications

I am trying to profile my application, which uses cuBLAS exclusively, with the Nvidia Visual Profiler on Windows. However, it shows that my application is not using the GPU at all! That is, the timeline is completely empty except for the profiling overhead. Just to make sure that nobody had changed the security settings or anything else underneath, I profiled an application with a kernel and cudaMemcpy calls, and it was profiled correctly. What gives? Am I missing a setting? Linking to the wrong version of the cuBLAS libraries? Or is it not actually calling the GPU (although that seems completely unbelievable to me...)? I am using the Intel compiler for 64-bit support, if that matters.

Thanks!

+3




1 answer


For anyone who runs into this problem in the future: I had to call cudaProfilerStart() and cudaProfilerStop() around my cuBLAS function. Just adding cudaDeviceSynchronize() didn't fix the problem.



+5








