C struct => ctypes struct ... is this mapping correct?

I'm trying to access two legacy compression/decompression functions from Python. They are written in C and are currently available via DLLs (I have the C source).

The functions are passed a (partially) filled C structure and use this information to compress or decompress the data in the provided buffer.

These are the function declarations. I added __cdecl for Python compatibility.

// Both functions return 0 on success and nonzero value on failure
int __cdecl pkimplode(struct pkstream *pStr);
int __cdecl pkexplode(struct pkstream *pStr);


Here's the pkstream structure as defined in C:

struct pkstream {
   unsigned char *pInBuffer;           // Pointer to input buffer
   unsigned int nInSize;               // Size of input buffer
   unsigned char *pOutBuffer;          // Pointer to output buffer
   unsigned int nOutSize;              // Size of output buffer upon return
   unsigned char nLitSize;             // Specifies fixed or var size literal bytes
   unsigned char nDictSizeByte;        // Dictionary size; either 1024, 2048, or 4096
   // The rest of the members of this struct are used internally,
   // so setting these values outside pkimplode or pkexplode has no effect
   unsigned char *pInPos;              // Current position in input buffer
   unsigned char *pOutPos;             // Current position in output buffer
   unsigned char nBits;                // Number of bits in bit buffer
   unsigned long nBitBuffer;           // Stores bits until enough to output a byte
   unsigned char *pDictPos;            // Position in dictionary
   unsigned int nDictSize;             // Maximum size of dictionary
   unsigned int nCurDictSize;          // Current size of dictionary
   unsigned char Dict[0x1000];         // Sliding dictionary used for comp/decomp
};


This is my attempt at mirroring this structure in Python.

# Define the pkstream struct
class PKSTREAM(Structure):
   _fields_ = [('pInBuffer', c_ubyte),
               ('nInSize', c_uint),
               ('pOutBuffer', c_ubyte),
               ('nOutSize', c_uint),
               ('nLitSize', c_ubyte),
               ('nDictSizeByte', c_ubyte),
               ('pInPos', c_ubyte),
               ('pOutPos', c_ubyte),
               ('nBits', c_ubyte),
               ('nBitBuffer', c_ulong),
               ('pDictPos', c_ubyte),
               ('nDictSize', c_uint),
               ('nCurDictSize', c_uint),
               ('Dict', c_ubyte)]


I would really appreciate help with the following questions (I would rather ask up front than just experiment with the interface, for hopefully obvious reasons):

  • I'm not sure whether to use c_ubyte, c_char, or c_char_p for the unsigned char members. c_ubyte is the closest ctypes match for unsigned char (according to the docs, at least), but it is really an int/long in Python.

  • Sometimes a member is a pointer to an unsigned char ... would this map to c_char_p? The ctypes docs say that all byte and unicode strings are passed as pointers anyway, so what provisions do I need to make for this?

  • I need to provide pOutBuffer, which must point to allocated memory into which the function can copy the compressed or decompressed data. I believe I should use create_string_buffer() to create a buffer of the appropriate size for this?

  • I also need to know how to define the Dict[0x1000] member, which looks (to me) like a 4096-byte buffer for internal use inside the functions. I know my definition is clearly wrong, but I'm not sure how to define it.

  • Should the C functions be decorated as __stdcall or __cdecl? (I used the latter in some test DLLs while working up to this point.)

Any feedback will be VERY highly appreciated!

Thanks in advance,




1 answer

If members of the structure are pointers, you must declare them as pointers on the Python side as well.

One way to do this is to use the POINTER utility in ctypes. This is a slightly higher-level object than ctypes.c_char_p (and not fully compatible with it), but it makes your code more readable. Also, to model C arrays, a ctypes base type can be multiplied by an integer; the result is a type that can be used as a C array of that base type with that length. The Dict field can therefore be defined as c_ubyte * 4096.
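As a minimal sketch of the two building blocks mentioned above, the snippet below builds an array type by multiplication and a pointer type with POINTER. The names DictArray and UBytePtr are illustrative only and do not come from the original code.

```python
from ctypes import POINTER, c_ubyte, cast, sizeof

# Array type: multiply a ctypes base type by an integer length.
# This mirrors "unsigned char Dict[0x1000]" in C.
DictArray = c_ubyte * 0x1000
dict_buf = DictArray()          # zero-initialised instance
dict_buf[0] = 0xFF              # elements read and write as Python ints

# Pointer type: POINTER(base) is the ctypes analogue of "unsigned char *".
UBytePtr = POINTER(c_ubyte)
p = cast(dict_buf, UBytePtr)    # pointer to the first element of the array

array_size = sizeof(DictArray)  # size in bytes of the array type
first = p[0]                    # read through the pointer
```

Because the elements are c_ubyte, indexing yields plain Python integers rather than one-byte bytes objects, which is usually more convenient for numeric fields.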


Note that while unsigned char corresponds to c_ubyte, plain int corresponds to c_int rather than c_uint (which is for unsigned int), and similarly for long.


Your structure definition does not declare the buffers as const. If you pass a Python string (which is immutable) and your library tries to modify it, you will get errors. Instead, you should pass mutable memory, as returned by create_string_buffer, initialized with your string.
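The following sketch shows the mutable-buffer approach under simple assumptions. The two-field Buf structure is a hypothetical stand-in used only to keep the example short; it is not part of the library.

```python
from ctypes import POINTER, Structure, c_char, c_uint, create_string_buffer, sizeof

class Buf(Structure):
    # Hypothetical two-field struct: a byte pointer and its length.
    _fields_ = [('pData', POINTER(c_char)),
                ('nSize', c_uint)]

payload = b"data to compress"

# Mutable copy of the bytes; passing the size explicitly avoids the
# extra trailing NUL that create_string_buffer would otherwise append.
in_buf = create_string_buffer(payload, len(payload))

b = Buf()
b.pData = in_buf          # a c_char array is accepted for a POINTER(c_char) field
b.nSize = len(payload)

# Slicing a POINTER(c_char) returns bytes read from the underlying memory.
view = b.pData[:b.nSize]
```

Keep a Python reference to in_buf for as long as the C code may use the pointer, so the memory is not released early.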

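# Define the pkstream struct
from ctypes import POINTER, Structure, c_char, c_ubyte, c_uint, c_ulong

class PKSTREAM(Structure):
    # The unsigned C members use the unsigned ctypes types. The byte
    # pointers use POINTER(c_char), which has the same layout as
    # unsigned char * and accepts create_string_buffer() arrays directly.
    _fields_ = [('pInBuffer', POINTER(c_char)),   # unsigned char *
                ('nInSize', c_uint),              # unsigned int
                ('pOutBuffer', POINTER(c_char)),  # unsigned char *
                ('nOutSize', c_uint),             # unsigned int
                ('nLitSize', c_ubyte),            # unsigned char
                ('nDictSizeByte', c_ubyte),       # unsigned char
                # The remaining members are used internally by the library
                ('pInPos', POINTER(c_char)),
                ('pOutPos', POINTER(c_char)),
                ('nBits', c_ubyte),
                ('nBitBuffer', c_ulong),          # unsigned long
                ('pDictPos', POINTER(c_char)),
                ('nDictSize', c_uint),
                ('nCurDictSize', c_uint),
                ('Dict', c_ubyte * 0x1000)]       # unsigned char Dict[0x1000]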
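To show how such a structure might be filled in and passed by pointer, here is a rough, self-contained sketch. Since the real DLL is not available here, fake_pkimplode is a made-up Python stand-in that merely copies input to output; it is not the library's algorithm. The unsigned field types are an assumption based on the C declaration in the question.

```python
from ctypes import (CFUNCTYPE, POINTER, Structure, byref, c_char, c_int,
                    c_ubyte, c_uint, c_ulong, create_string_buffer, memmove)

class PKSTREAM(Structure):
    _fields_ = [('pInBuffer', POINTER(c_char)), ('nInSize', c_uint),
                ('pOutBuffer', POINTER(c_char)), ('nOutSize', c_uint),
                ('nLitSize', c_ubyte), ('nDictSizeByte', c_ubyte),
                ('pInPos', POINTER(c_char)), ('pOutPos', POINTER(c_char)),
                ('nBits', c_ubyte), ('nBitBuffer', c_ulong),
                ('pDictPos', POINTER(c_char)), ('nDictSize', c_uint),
                ('nCurDictSize', c_uint), ('Dict', c_ubyte * 0x1000)]

# Prototype matching: int __cdecl fn(struct pkstream *pStr)
PKFUNC = CFUNCTYPE(c_int, POINTER(PKSTREAM))

@PKFUNC
def fake_pkimplode(p_str):
    # Stand-in for the DLL function (assumption): copy input to output.
    s = p_str.contents
    memmove(s.pOutBuffer, s.pInBuffer, s.nInSize)
    s.nOutSize = s.nInSize
    return 0  # 0 means success, as in the C declarations

data = b"hello, pkstream"
in_buf = create_string_buffer(data, len(data))  # mutable input copy
out_buf = create_string_buffer(2 * len(data))   # generous output buffer

stream = PKSTREAM()
stream.pInBuffer = in_buf
stream.nInSize = len(data)
stream.pOutBuffer = out_buf
# nLitSize and nDictSizeByte would be set here per the library's rules.

status = fake_pkimplode(byref(stream))
result = out_buf.raw[:stream.nOutSize]
```

With the real library, the equivalent step would be to set argtypes = [POINTER(PKSTREAM)] and restype = c_int on the function object obtained from the loaded DLL, then call it with byref(stream) in the same way.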


As for (5), I don't know how you should decorate your C functions - use whatever works.
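For what it's worth, ctypes ties the calling convention to the loader class: ctypes.CDLL calls functions as cdecl, while ctypes.WinDLL (Windows only) calls them as stdcall. The helper below is a small sketch of choosing between them; pick_loader and the DLL file name in the usage comment are hypothetical.

```python
import ctypes

def pick_loader(convention):
    """Return the ctypes loader class for a calling-convention name."""
    if convention == "cdecl":
        return ctypes.CDLL
    if convention == "stdcall":
        # WinDLL exists only on Windows builds of Python.
        loader = getattr(ctypes, "WinDLL", None)
        if loader is None:
            raise OSError("stdcall loading (WinDLL) is only available on Windows")
        return loader
    raise ValueError("unknown calling convention: %r" % (convention,))

# Usage (the file name is a placeholder, not the real DLL name):
# lib = pick_loader("cdecl")("pklib.dll")
```

The distinction only matters on 32-bit Windows; 64-bit Windows uses a single calling convention, so either loader behaves the same there.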


