Can't reproduce a working C bitwise encoding function in Python

I am reverse engineering a proprietary network protocol that generates a (static) one-time pad on startup and then uses it to encode / decode every packet sent / received. It uses the one-time pad in a series of complex XORs, shifts, and multiplications.

I produced the following C code after walking through the decoding function in the program with IDA. This function encodes / decodes the data perfectly:

void encodeData(char *buf)
{
    int i;
    size_t bufLen = *(unsigned short *)buf;
    unsigned long entropy = *((unsigned long *)buf + 2);
    int xorKey = 9 * (entropy ^ ((entropy ^ 0x3D0000) >> 16));
    unsigned short baseByteTableIndex = (60205 * (xorKey ^ (xorKey >> 4)) ^ (668265261 * (xorKey ^ (xorKey >> 4)) >> 15)) & 0x7FFF;

    //Skip first 24 bytes, as that is the header
    for (i = 24; i <= (signed int)bufLen; i++)
        buf[i] ^= byteTable[((unsigned short)i + baseByteTableIndex) & 2047];
}

      

Now I want to try my hand at creating a Peach fuzzer for this protocol. Since the encoding / decoding has to be done by custom Python code, I need to port this C function to Python.

I wrote the following Python function, but it fails to decode the packets I receive.

def encodeData(buf):
    newBuf = bytearray(buf)
    bufLen = unpack('H', buf[:2])
    entropy = unpack('I', buf[2:6])
    xorKey = 9 * (entropy[0] ^ ((entropy[0] ^ 0x3D0000) >> 16))
    baseByteTableIndex = (60205 * (xorKey ^ (xorKey >> 4)) ^ (668265261 * (xorKey ^ (xorKey >> 4)) >> 15)) & 0x7FFF;
    #Skip first 24 bytes, since that is header data
    for i in range(24,bufLen[0]):
        newBuf[i] = xorPad[(i + baseByteTableIndex) & 2047]
    return str(newBuf)

      

I've tried with and without array() or pack() / unpack() on various variables to force them to the correct size for the bitwise operations, but I must be missing something, because I can't get the Python code to behave like the C code. Does anyone know what I am missing?

If it helps you try it locally, here is the one-time pad generation function:

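from array import array
from struct import pack

xorPad = ''  # Python 2 byte string, filled in by buildXorPad()

def buildXorPad():
    global xorPad
    xorKey = array('H', [0xACE1])
    for i in range(0, 2048):
        xorKey[0] = -(xorKey[0] & 1) & 0xB400 ^ (xorKey[0] >> 1)
        xorPad = xorPad + pack('B',xorKey[0] & 0xFF)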

      

And here is a sample packet, hex encoded, in its original (encoded) and decoded forms.

Original: 20000108fcf3d71d98590000010000000000000000000000a992e0ee2525a5e5

Decoded: 20000108fcf3d71d98590000010000000000000000000000ae91e1ee25252525

Solution

It turns out my problem had little to do with the differences between C and Python types and was instead caused by some simple programming mistakes.

def encodeData(buf):
    newBuf = bytearray(buf)
    bufLen = unpack('H', buf[:2])
    entropy = unpack('I', buf[8:12])
    xorKey = 9 * (entropy[0] ^ ((entropy[0] ^ 0x3D0000) >> 16))
    baseByteTableIndex = (60205 * (xorKey ^ (xorKey >> 4)) ^ (668265261 * (xorKey ^ (xorKey >> 4)) >> 15)) & 0x7FFF;
    #Skip first 24 bytes, since that is header data
    for i in range(24,bufLen[0]):
        padIndex = (i + baseByteTableIndex) & 2047
        newBuf[i] ^= unpack('B',xorPad[padIndex])[0]
    return str(newBuf)
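
For anyone on Python 3, here is a rough port of the corrected function together with the pad generator. It is only a sketch: the names build_xor_pad, encode_data and XOR_PAD are made up for this example, the explicit little-endian format strings assume the packets come from a little-endian machine, and the min() guard on the loop bound is an addition that is not in the code above.

```python
import struct

def build_xor_pad(size=2048, seed=0xACE1):
    """Generate the pad with the same 16-bit shift-register step as buildXorPad()."""
    pad = bytearray()
    state = seed
    for _ in range(size):
        # Shift right; XOR in 0xB400 when the bit shifted out was 1.
        state = (-(state & 1) & 0xB400) ^ (state >> 1)
        pad.append(state & 0xFF)
    return bytes(pad)

XOR_PAD = build_xor_pad()

def encode_data(buf):
    new_buf = bytearray(buf)
    (buf_len,) = struct.unpack_from("<H", buf, 0)   # length field at offset 0
    (entropy,) = struct.unpack_from("<I", buf, 8)   # entropy field at offset 8
    xor_key = 9 * (entropy ^ ((entropy ^ 0x3D0000) >> 16))
    mixed = xor_key ^ (xor_key >> 4)
    base = ((60205 * mixed) ^ ((668265261 * mixed) >> 15)) & 0x7FFF
    # Skip the 24-byte header; never index past the end of the buffer.
    for i in range(24, min(buf_len, len(new_buf))):
        new_buf[i] ^= XOR_PAD[(i + base) & 2047]
    return bytes(new_buf)
```

In Python 3, indexing a bytes object already yields an int, so the unpack('B', ...) step is no longer needed. Because the header is left untouched, calling encode_data twice on the same packet returns the original bytes.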

      

Thanks everyone for your help!

2 answers


This line of C:

unsigned long entropy = *((unsigned long *)buf + 2);

      

should translate into

entropy = unpack('I', buf[8:12])

      

because buf is cast to unsigned long * before 2 is added to the address, which advances it by the size of 2 unsigned longs, not by 2 bytes (assuming an unsigned long is 4 bytes).
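
To see the difference on the sample packet from the question, here is a small Python 3 sketch. It assumes a 4-byte unsigned long and little-endian byte order; the variable names are just for illustration.

```python
import struct

# Sample (encoded) packet from the question.
packet = bytes.fromhex(
    "20000108fcf3d71d98590000010000000000000000000000a992e0ee2525a5e5"
)

ELEMENT_SIZE = struct.calcsize("<I")   # 4 bytes, standing in for unsigned long
offset = 2 * ELEMENT_SIZE              # (unsigned long *)buf + 2 -> byte offset 8

wrong = struct.unpack_from("<I", packet, 2)[0]       # what buf[2:6] reads
right = struct.unpack_from("<I", packet, offset)[0]  # what the C code reads

print(hex(wrong), hex(right))
```

The two reads return completely different values, so every index derived from entropy is wrong when the offset is 2.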



Also, this line:

newBuf[i] = xorPad[(i + baseByteTableIndex) & 2047]

      

should be

newBuf[i] ^= xorPad[(i + baseByteTableIndex) & 2047]

      

to match C, otherwise the output is not based on the contents of the buffer.
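
A short Python 3 illustration of the difference. The four pad bytes used here are simply the XOR of the Original and Decoded packets from the question at offsets 24 to 27; the variable names are made up for the example.

```python
payload = bytes.fromhex("a992e0ee")   # bytes 24..27 of the Original packet
pad = bytes.fromhex("07030100")       # Original XOR Decoded at the same offsets

assigned = bytearray(payload)
xored = bytearray(payload)
for i in range(len(payload)):
    assigned[i] = pad[i]    # plain assignment: the payload is thrown away
    xored[i] ^= pad[i]      # XOR: the payload is combined with the pad

print(assigned.hex())  # just the pad bytes
print(xored.hex())     # bytes 24..27 of the Decoded packet
```

With plain assignment the result is the pad itself, whatever the packet contained. With ^= the result depends on the payload, and applying the same pad a second time restores the original bytes.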



Python integers do not overflow: they are automatically promoted to arbitrary precision when they exceed sys.maxint (or fall below -sys.maxint-1).

>>> import sys
>>> sys.maxint
9223372036854775807
>>> sys.maxint + 1
9223372036854775808L

      

Using array and / or unpack doesn't seem to make a difference (as you discovered):



>>> from array import array
>>> from struct import unpack
>>> array('H', [1])[0] + sys.maxint
9223372036854775808L
>>> unpack('H', '\x01\x00')[0] + sys.maxint
9223372036854775808L

      

To truncate your numbers, you will need to simulate the overflow manually by ANDing with an appropriate bitmask whenever an operation can make the value larger than the C variable it stands for.
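
As a minimal sketch of that masking, assuming 32-bit C variables: the helper names u32 and s32 are invented for this example, and the operands are arbitrary values chosen so that the product no longer fits in 32 bits.

```python
MASK32 = 0xFFFFFFFF

def u32(value):
    """Keep only the low 32 bits, like storing into a 32-bit unsigned variable."""
    return value & MASK32

def s32(value):
    """Reinterpret the low 32 bits as a two's-complement signed value."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

product = 60205 * 201889   # needs more than 32 bits, so Python keeps growing it
wrapped = u32(product)     # what a 32-bit unsigned variable would hold

print(product, wrapped, s32(wrapped))
```

Applying u32 after each multiplication or left shift keeps intermediate values within the range the C code works with. s32 is only needed where the C variable is signed and its sign matters, for example before a right shift.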
