CUDA concurrent kernel priority
I have two kernels (A and B) that can run concurrently. I need kernel A to finish as soon as possible (so that its results can be exchanged over MPI). So I could execute them in a single stream: A and then B.
However, kernel A has only a few thread blocks, so if I run A and B sequentially, the GPU is not fully utilized while A is running.
Is it possible to execute A and B at the same time, with A having the higher priority?
I.e., I want thread blocks from kernel B to start executing only when kernel A has no thread blocks left waiting to start.
As I understand it, if I launch kernel A in one stream and, on the next line of the host code, launch kernel B in another stream, I have no guarantee that thread blocks from B will not actually be executed first?
NVIDIA now provides a way to prioritize CUDA kernels. This is a fairly new feature, so you need CUDA 5.5 or later to use it.
In your case, you would launch kernel A in a high-priority CUDA stream and kernel B in a low-priority CUDA stream. The function you probably want is cudaStreamCreateWithPriority(..., priority).
- To use this feature, you need a GPU with compute capability 3.5 or higher. To check whether priorities are supported on your GPU, take a look at cudaDeviceProp::streamPrioritiesSupported.
- cudaDeviceGetStreamPriorityRange should tell you how many priority levels are available on your GPU. The syntax of cudaDeviceGetStreamPriorityRange is a bit unusual; it's worth looking at the CUDA manual to see how it works.
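The two checks above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the helper names reportStreamPriorities and priorityLevelCount are made up for illustration, and only the CUDA runtime calls themselves come from the API. Note that the range query returns the least priority first and the greatest priority second, and that the greatest priority is the numerically smaller value.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: number of distinct priority levels in the reported range.
// Lower numbers mean higher priority, so greatestPriority <= leastPriority.
int priorityLevelCount(int leastPriority, int greatestPriority) {
    return leastPriority - greatestPriority + 1;
}

// Hypothetical helper: prints whether `device` supports stream priorities
// and which priority range the runtime reports for it.
// Returns the error code of the first failing CUDA call, or cudaSuccess.
cudaError_t reportStreamPriorities(int device) {
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) return err;

    std::printf("%s (compute %d.%d): stream priorities %s\n",
                prop.name, prop.major, prop.minor,
                prop.streamPrioritiesSupported ? "supported" : "not supported");

    // The range query applies to the current device, so select it first.
    err = cudaSetDevice(device);
    if (err != cudaSuccess) return err;

    // Output order: least priority first, greatest priority second.
    int leastPriority = 0, greatestPriority = 0;
    err = cudaDeviceGetStreamPriorityRange(&leastPriority, &greatestPriority);
    if (err != cudaSuccess) return err;

    std::printf("least = %d, greatest = %d, levels = %d\n",
                leastPriority, greatestPriority,
                priorityLevelCount(leastPriority, greatestPriority));
    return cudaSuccess;
}
```

Calling reportStreamPriorities(0) from main would print the result for the first GPU. On a device without priority support, the runtime is documented to return 0 for both values, which this sketch would report as a single level.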
More detailed documentation on priority settings, from the CUDA Runtime API manual:
cudaError_t cudaStreamCreateWithPriority(cudaStream_t *pStream,
unsigned int flags, int priority)
Create an asynchronous stream with the specified priority.
Parameters
pStream = Pointer to new stream identifier
flags = Flags for stream creation. See cudaStreamCreateWithFlags for a list of
valid flags that can be passed
priority = Priority of the stream. Lower numbers represent higher priorities. See
cudaDeviceGetStreamPriorityRange for more information about the
meaningful stream priorities that can be passed.
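Putting this together for the question's two kernels might look something like the sketch below. The kernel bodies, the function name runPrioritized and its buffer parameters are hypothetical stand-ins, since the original kernels are not shown; only the stream calls are real CUDA runtime API. As far as I know, priorities only influence which pending thread blocks are scheduled next and do not preempt blocks that are already running, so treat this as a scheduling hint rather than a hard guarantee.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-ins for the question's kernels A and B.
__global__ void kernelA(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

__global__ void kernelB(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Launches A in a high-priority stream and B in a low-priority stream,
// waits for A first, then for B. Assumes nA > 0 and nB > 0 and that
// dA and dB are device pointers.
cudaError_t runPrioritized(float* dA, int nA, float* dB, int nB) {
    int leastPriority = 0, greatestPriority = 0;
    cudaError_t err =
        cudaDeviceGetStreamPriorityRange(&leastPriority, &greatestPriority);
    if (err != cudaSuccess) return err;

    // Lower number = higher priority, so A gets greatestPriority.
    cudaStream_t high, low;
    err = cudaStreamCreateWithPriority(&high, cudaStreamNonBlocking,
                                       greatestPriority);
    if (err != cudaSuccess) return err;
    err = cudaStreamCreateWithPriority(&low, cudaStreamNonBlocking,
                                       leastPriority);
    if (err != cudaSuccess) {
        cudaStreamDestroy(high);
        return err;
    }

    const int threads = 256;
    kernelA<<<(nA + threads - 1) / threads, threads, 0, high>>>(dA, nA);
    kernelB<<<(nB + threads - 1) / threads, threads, 0, low>>>(dB, nB);

    // Wait only for A; the MPI exchange of A's results could start here
    // while B is still running.
    cudaError_t errA = cudaStreamSynchronize(high);

    // Then wait for B before releasing the streams.
    cudaError_t errB = cudaStreamSynchronize(low);

    cudaStreamDestroy(high);
    cudaStreamDestroy(low);
    return errA != cudaSuccess ? errA : errB;
}
```

Synchronizing on the high-priority stream alone is what lets the host react to A's completion without waiting for B.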