How do I use custom compute shaders with Metal and get very smooth performance?
I am trying to apply live camera filters through Metal, using both the default MPSKernel filters provided by Apple and custom compute shaders.
After the compute shader pass I do an in-place encode with MPSImageGaussianBlur. Here is the code:
func encode(to commandBuffer: MTLCommandBuffer, sourceTexture: MTLTexture, destinationTexture: MTLTexture, cropRect: MTLRegion = MTLRegion.init(), offset : CGPoint) {
    let blur = MPSImageGaussianBlur(device: device, sigma: 0)
    blur.clipRect = cropRect
    blur.offset = MPSOffset(x: Int(offset.x), y: Int(offset.y), z: 0)

    // Custom compute pass: sourceTexture -> destinationTexture.
    // Note: the integer division truncates, so edge pixels are skipped
    // when the texture size is not a multiple of 4.
    let threadsPerThreadgroup = MTLSizeMake(4, 4, 1)
    let threadgroupsPerGrid = MTLSizeMake(sourceTexture.width / threadsPerThreadgroup.width, sourceTexture.height / threadsPerThreadgroup.height, 1)
    let commandEncoder = commandBuffer.makeComputeCommandEncoder()
    commandEncoder.setComputePipelineState(pipelineState!)
    commandEncoder.setTexture(sourceTexture, at: 0)
    commandEncoder.setTexture(destinationTexture, at: 1)
    commandEncoder.dispatchThreadgroups(threadgroupsPerGrid, threadsPerThreadgroup: threadsPerThreadgroup)
    commandEncoder.endEncoding()

    // In-place blur of the compute output. With a nil fallbackCopyAllocator,
    // encode returns false whenever the kernel cannot run in place.
    autoreleasepool {
        var inPlaceTexture = destinationTexture
        blur.encode(commandBuffer: commandBuffer, inPlaceTexture: &inPlaceTexture, fallbackCopyAllocator: nil)
    }
}
But sometimes the in-place encode fails, which produces a visible stutter on the screen.
If anyone can suggest a solution that does not use an in-place texture, explain how to use fallbackCopyAllocator, or show a different way to use compute shaders, that would be really helpful.
I have done a fair amount of coding in this area (applying compute shaders to the video stream from the camera), and the most common problem you run into is the "pixel buffer reuse" issue.
The Metal texture you create from the sample buffer is backed by a pixel buffer, which is managed by the video session and can be reused for subsequent video frames unless you keep a reference to the sample buffer (keeping a reference to the Metal texture is not enough).
Feel free to take a look at my code at https://github.com/snakajima/vs-metal, which applies various compute shaders to the live video stream.
The VSContext:set() method takes an optional sampleBuffer parameter in addition to the texture parameter, and keeps a reference to the sampleBuffer until the compute shader finishes its computation (in the VSRuntime:encode() method).
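As a rough illustration of the idea (not the actual vs-metal code), the sketch below keeps the sample buffer alive by capturing it in the command buffer's completion handler. The `FrameProcessor` class and its `process` method are hypothetical names, and the `.bgra8Unorm` format assumes the capture output is configured for 32BGRA pixel buffers.

```swift
import CoreMedia
import CoreVideo
import Metal

/// Hypothetical helper: wraps a camera frame in a Metal texture and keeps the
/// CMSampleBuffer alive until the GPU has finished the work that reads it.
final class FrameProcessor {
    let queue: MTLCommandQueue
    private var textureCache: CVMetalTextureCache?

    init?(device: MTLDevice) {
        guard let queue = device.makeCommandQueue() else { return nil }
        self.queue = queue
        CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, device, nil, &textureCache)
    }

    /// `encode` is where the caller encodes its compute passes against the frame texture.
    func process(_ sampleBuffer: CMSampleBuffer,
                 encode: (MTLCommandBuffer, MTLTexture) -> Void) {
        guard let cache = textureCache,
              let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer),
              let commandBuffer = queue.makeCommandBuffer() else { return }

        // Assumption: the camera delivers 32BGRA frames.
        var cvTexture: CVMetalTexture?
        CVMetalTextureCacheCreateTextureFromImage(
            kCFAllocatorDefault, cache, pixelBuffer, nil, .bgra8Unorm,
            CVPixelBufferGetWidth(pixelBuffer), CVPixelBufferGetHeight(pixelBuffer),
            0, &cvTexture)
        guard let frameTexture = cvTexture,
              let texture = CVMetalTextureGetTexture(frameTexture) else { return }

        encode(commandBuffer, texture)

        // Capturing sampleBuffer (and the CVMetalTexture) here keeps the backing
        // pixel buffer from being recycled until the GPU is done with this frame.
        commandBuffer.addCompletedHandler { _ in
            _ = sampleBuffer
            _ = frameTexture
        }
        commandBuffer.commit()
    }
}
```

You would call `process` from the capture delegate's `captureOutput(_:didOutput:from:)` callback, passing the sample buffer you receive there.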
The in-place method can be hit or miss, depending on what the underlying filter is doing. If the filter is single-pass for some parameters, then in those cases it cannot run in place and you end up running out of place.
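For the cases where the kernel cannot run in place, one option is to supply a fallbackCopyAllocator so that MPS can write into a freshly allocated texture instead of failing. The following is a minimal sketch; `copyAllocator`, `encodeBlur` and `encodeBlurOutOfPlace` are hypothetical names, and allocating a new texture per fallback is the simplest approach, not necessarily the fastest.

```swift
import Metal
import MetalPerformanceShaders

// Called by MPS only when the kernel cannot operate in place.
// It must return a texture that MPS can write the result into.
let copyAllocator: MPSCopyAllocator = { _, commandBuffer, source in
    let descriptor = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: source.pixelFormat,
        width: source.width,
        height: source.height,
        mipmapped: false)
    descriptor.usage = [.shaderRead, .shaderWrite]
    guard let texture = commandBuffer.device.makeTexture(descriptor: descriptor) else {
        fatalError("Could not allocate a fallback texture")
    }
    return texture
}

/// Blurs `texture` in place when possible. The returned texture holds the result;
/// it may be a different object from the input if the fallback path was taken.
func encodeBlur(_ blur: MPSImageGaussianBlur,
                on commandBuffer: MTLCommandBuffer,
                texture: MTLTexture) -> MTLTexture {
    var result = texture
    let succeeded = blur.encode(commandBuffer: commandBuffer,
                                inPlaceTexture: &result,
                                fallbackCopyAllocator: copyAllocator)
    assert(succeeded, "In-place encode failed even with a copy allocator")
    return result
}

/// Alternative that avoids the in-place path entirely by using two textures.
func encodeBlurOutOfPlace(_ blur: MPSImageGaussianBlur,
                          on commandBuffer: MTLCommandBuffer,
                          source: MTLTexture,
                          destination: MTLTexture) {
    blur.encode(commandBuffer: commandBuffer,
                sourceTexture: source,
                destinationTexture: destination)
}
```

The important detail with the in-place variant is to keep using the texture returned from `encodeBlur` for later passes and for presentation, since the original texture is not guaranteed to contain the blurred result.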
Since this method was added, MPS has added an underlying MTLHeap to manage memory more transparently for you. If your MPSImage does not need to be viewed by the CPU and only exists for a short period of time on the GPU, it is recommended that you use MPSTemporaryImage instead. When readCount reaches 0, the backing store is recycled through the MPS heap and made available to other MPSTemporaryImages and other temporary resources used downstream. Likewise, its backing store is not actually allocated from the heap until absolutely necessary (for example, when the texture is written to or .texture is called). A separate heap is allocated for each command buffer.
The use of temporary images should significantly reduce memory usage. For example, on a graph of the Inception v3 neural network that has over a hundred passes, the heap was able to automatically reduce the graph to four allocations.
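Applied to the pipeline in the question, a temporary image could hold the output of the custom compute pass so that the blur runs out of place into the final texture. This is only a sketch under assumptions: `encodeFilterChain` and the `encodeCustomPass` closure are hypothetical, and the intermediate is assumed to share the source texture's pixel format and size.

```swift
import Metal
import MetalPerformanceShaders

/// Runs a custom compute pass into a temporary image, then blurs from the
/// temporary image into `destination`, so no in-place encode is needed.
func encodeFilterChain(commandBuffer: MTLCommandBuffer,
                       source: MTLTexture,
                       destination: MTLTexture,
                       sigma: Float,
                       encodeCustomPass: (MTLTexture, MTLTexture) -> Void) {
    let descriptor = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: source.pixelFormat,
        width: source.width,
        height: source.height,
        mipmapped: false)
    descriptor.usage = [.shaderRead, .shaderWrite]
    descriptor.storageMode = .private   // temporary images live on the GPU only

    // Backing memory comes from the heap MPS keeps for this command buffer.
    let intermediate = MPSTemporaryImage(commandBuffer: commandBuffer,
                                         textureDescriptor: descriptor)

    // Pass 1: the caller's compute shader writes source -> intermediate.
    encodeCustomPass(source, intermediate.texture)

    // Pass 2: blur intermediate -> destination (out of place).
    let blur = MPSImageGaussianBlur(device: commandBuffer.device, sigma: sigma)
    blur.encode(commandBuffer: commandBuffer,
                sourceTexture: intermediate.texture,
                destinationTexture: destination)

    // We accessed .texture directly, so release the backing store ourselves.
    intermediate.readCount = 0
}
```

Inside `encodeCustomPass` you would create the compute command encoder, bind the two textures, dispatch, and end encoding, as in the question's code. The temporary image is only valid on the command buffer it was created for.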