What is the performance impact of atomic operations in a compute shader?

I have a compute shader that changes texels in a 256x256 texture.

The shader is dispatched with 256x256x256 invocations, where the x and y components of each invocation ID map directly to the u and v texel coordinates. Thus, each texel can be written up to 256 times.
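To make the collision count concrete, here is a small CPU-side C++ sketch of the mapping described above. The names (`kWidth`, `targetTexel`, and so on) are hypothetical and only mirror the numbers in the question; this is not the actual shader code.

```cpp
#include <cassert>
#include <cstdint>

// Assumed dispatch dimensions, taken from the question: 256 x 256 x 256.
constexpr std::uint32_t kWidth = 256, kHeight = 256, kDepth = 256;

struct Texel { std::uint32_t u, v; };

// Maps a global invocation ID (x, y, z) to the texel it targets.
// z is ignored, so every invocation sharing (x, y) hits the same texel.
constexpr Texel targetTexel(std::uint32_t x, std::uint32_t y, std::uint32_t /*z*/) {
    return Texel{x, y};
}

// Total number of shader invocations in one dispatch.
constexpr std::uint64_t totalInvocations() {
    return std::uint64_t{kWidth} * kHeight * kDepth;
}

// Number of invocations that target any single texel (one per z value).
constexpr std::uint64_t invocationsPerTexel() {
    return totalInvocations() / (std::uint64_t{kWidth} * kHeight);
}
```

With these dimensions there are 16,777,216 invocations competing for 65,536 texels, so any contention is limited to the 256 invocations that share one (x, y) pair.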

I want every invocation to check what is currently in its texel and run some tests to decide whether the texel should be overwritten. However, to avoid the race where several invocations all read the texel value before any of them has written to it, I'm looking to use an atomic operation to write the texture values.
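The read-test-write pattern described here is usually expressed as a compare-and-swap loop. Below is a minimal CPU-side C++ sketch using `std::atomic`; in GLSL the analogous building block would be `imageAtomicCompSwap` on an `r32ui` or `r32i` image. The test in `shouldOverwrite` (keep the smaller value) is a placeholder assumption, since the question does not say what the real tests are, and both function names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Placeholder test (an assumption): overwrite only if the candidate is smaller.
bool shouldOverwrite(std::uint32_t current, std::uint32_t candidate) {
    return candidate < current;
}

// Performs "read, test, conditionally write" as one logical atomic step.
// Returns true if the candidate was stored, false if the test rejected it.
bool conditionalWrite(std::atomic<std::uint32_t>& texel, std::uint32_t candidate) {
    std::uint32_t current = texel.load(std::memory_order_relaxed);
    while (shouldOverwrite(current, candidate)) {
        // Succeeds only if the texel still holds `current`. On failure,
        // `current` is refreshed with the latest value and the test re-runs.
        if (texel.compare_exchange_weak(current, candidate,
                                        std::memory_order_relaxed)) {
            return true;
        }
    }
    return false;
}
```

`conditionalWrite` can be called from several threads on the same `std::atomic` without losing an update: a writer retries only when another writer changed that same texel in between. If the real test happens to be a plain minimum or maximum, GLSL's `imageAtomicMin` or `imageAtomicMax` would do the same job in a single call.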

However, I was told that this defeats the point of parallelizing the operation, because atomic operations force everything else to wait until they finish. That would mean the invocations along z must run sequentially, each waiting for the previous one to finish its atomic write to the texture.

Is this so, and if so, how much will it affect performance? It's worth noting that the z dimension of the dispatch can vary and can be much larger than 256.

+3
c++ atomic opengl textures compute-shader




No one has answered this question yet
