Why is this loop so slow in Cython?
This code reorders the bits of each pixel in an RGBA4 534x713 texture.
cpdef bytes toDDSrgba4(bytearray data):
    cdef bytes new_data = b''
    cdef int pixel
    cdef int red
    cdef int green
    cdef int blue
    cdef int alpha
    cdef int new_pixel
    cdef int i
    for i in range(len(data) // 2):
        pixel = int.from_bytes(data[2*i:2*i+2], "big")
        red = (pixel >> 12) & 0xF
        green = (pixel >> 8) & 0xF
        blue = (pixel >> 4) & 0xF
        alpha = pixel & 0xF
        new_pixel = (red << 8) | (green << 4) | blue | (alpha << 12)
        new_data += (new_pixel).to_bytes(2, "big")
    return new_data
This is no faster than the pure Python equivalent, which is:
def toDDSrgba4(data):
    new_data = b''
    for i in range(len(data) // 2):
        pixel = int.from_bytes(data[2*i:2*i+2], "big")
        red = (pixel >> 12) & 0xF
        green = (pixel >> 8) & 0xF
        blue = (pixel >> 4) & 0xF
        alpha = pixel & 0xF
        new_pixel = (red << 8) | (green << 4) | blue | (alpha << 12)
        new_data += (new_pixel).to_bytes(2, "big")
    return new_data
Both are very slow.
I have written a really complicated swizzle routine that is not even optimized, tested it on this texture, and it is still way faster than this.
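For comparison, here is a minimal sketch of the same bit reordering without the per-pixel slice, `int.from_bytes`, `to_bytes` and `bytes +=` calls. Those all go through Python objects even in the Cython version, so the `cdef int` declarations cannot speed them up, and `+=` on `bytes` can copy the whole buffer on every iteration. The name `to_dds_rgba4_prealloc` is hypothetical, and the sketch is plain Python; it has not been timed against the code above.

```python
def to_dds_rgba4_prealloc(data):
    """Reorder RRRRGGGG BBBBAAAA pixels into AAAARRRR GGGGBBBB.

    Hypothetical helper: same result as toDDSrgba4, but it writes into a
    preallocated bytearray instead of growing a bytes object.
    """
    n = 2 * (len(data) // 2)      # ignore a trailing odd byte, like the original
    out = bytearray(n)            # allocate the whole output once
    for i in range(0, n, 2):
        hi = data[i]              # red (high nibble), green (low nibble)
        lo = data[i + 1]          # blue (high nibble), alpha (low nibble)
        out[i] = ((lo & 0xF) << 4) | (hi >> 4)      # alpha, red
        out[i + 1] = ((hi & 0xF) << 4) | (lo >> 4)  # green, blue
    return bytes(out)


# Pixel 0x1234 (r=1, g=2, b=3, a=4) becomes 0x4123.
print(to_dds_rgba4_prealloc(bytearray(b"\x12\x34")).hex())  # 4123
```

In Cython, the same indexing pattern on a typed buffer is what would let the loop compile to plain C arithmetic, but that part is an assumption here rather than something measured.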