C struct => ctypes struct ... is this mapping correct?
I'm trying to access two legacy compression/decompression functions from Python. They are written in C and are currently available via a DLL (I have the C source).
The functions are passed a (partially) filled C structure and use this information to compress or decompress the data in the provided buffer.
These are the function declarations. I added __cdecl for Python compatibility.
// Both functions return 0 on success and nonzero value on failure
int __cdecl pkimplode(struct pkstream *pStr);
int __cdecl pkexplode(struct pkstream *pStr);
Here's the pkstream structure as defined in C:
struct pkstream {
    unsigned char *pInBuffer;    // Pointer to input buffer
    unsigned int nInSize;        // Size of input buffer
    unsigned char *pOutBuffer;   // Pointer to output buffer
    unsigned int nOutSize;       // Size of output buffer upon return
    unsigned char nLitSize;      // Specifies fixed or var size literal bytes
    unsigned char nDictSizeByte; // Dictionary size; either 1024, 2048, or 4096
    // The rest of the members of this struct are used internally,
    // so setting these values outside pkimplode or pkexplode has no effect
    unsigned char *pInPos;       // Current position in input buffer
    unsigned char *pOutPos;      // Current position in output buffer
    unsigned char nBits;         // Number of bits in bit buffer
    unsigned long nBitBuffer;    // Stores bits until enough to output a byte
    unsigned char *pDictPos;     // Position in dictionary
    unsigned int nDictSize;      // Maximum size of dictionary
    unsigned int nCurDictSize;   // Current size of dictionary
    unsigned char Dict[0x1000];  // Sliding dictionary used for comp/decomp
};
This is my attempt at mirroring this structure in Python.
# Define the pkstream struct
class PKSTREAM(Structure):
    _fields_ = [('pInBuffer', c_ubyte),
                ('nInSize', c_uint),
                ('pOutBuffer', c_ubyte),
                ('nOutSize', c_uint),
                ('nLitSize', c_ubyte),
                ('nDictSizeByte', c_ubyte),
                ('pInPos', c_ubyte),
                ('pOutPos', c_ubyte),
                ('nBits', c_ubyte),
                ('nBitBuffer', c_ulong),
                ('pDictPos', c_ubyte),
                ('nDictSize', c_uint),
                ('nCurDictSize', c_uint),
                ('Dict', c_ubyte)]
I would really appreciate help with the following questions (I would rather ask up front than just guess, hopefully for obvious reasons):
1. I'm not sure whether to use c_ubyte, c_char, or c_char_p for the unsigned char members. c_ubyte is the closest ctypes match for unsigned char (according to the docs, at least), but it is really an int/long in Python.
2. Some members are pointers to unsigned char. Would these map to c_char_p? The ctypes docs say that all byte and unicode strings are passed as pointers anyway, so what provision do I need to make for this?
3. I need to provide pOutBuffer to the function, and it must point to allocated memory into which the function can copy the compressed or decompressed data. I believe I should use create_string_buffer() to create a buffer of the appropriate size for this?
4. I also need to know how to define the Dict[0x1000] member, which looks (to me) like a 4096-byte buffer for internal use inside the functions. I know my definition is clearly wrong, but I'm not sure how to define it.
5. Should the C functions be declared as __stdcall or __cdecl? (I used the latter in some test DLLs while working up to this point.)
Any feedback will be VERY highly appreciated!
Thanks in advance,
James
If members of the structure are pointers, you must declare them as pointers on the Python side as well.
One way to do this is with the POINTER utility in ctypes. It is a slightly higher-level object than ctypes.c_char_p (and not fully compatible with it), but it makes your code more readable. Also, to model C arrays, the basic ctypes types can be multiplied by a scalar; the resulting type can be used as a C array of that base type with that length (so the Dict field can be defined as below, for example as c_ubyte * 4096).
Note that while unsigned char corresponds to c_ubyte, a plain int corresponds to c_int rather than c_uint, and similarly for long. The signed and unsigned variants have the same size, so the structure layout is the same either way; only the interpretation of the values differs.
Your structure definition does not mark the buffers as const. If you pass in a Python string (which is immutable) and your library tries to modify it, you will get errors. Instead, pass mutable memory such as that returned by create_string_buffer, initialized with your string.
from ctypes import POINTER, Structure, c_char, c_int, c_long

# Define the pkstream struct
class PKSTREAM(Structure):
    _fields_ = [('pInBuffer', POINTER(c_char)),   # pointer members use POINTER(...)
                ('nInSize', c_int),
                ('pOutBuffer', POINTER(c_char)),
                ('nOutSize', c_int),
                ('nLitSize', c_char),
                ('nDictSizeByte', c_char),
                ('pInPos', POINTER(c_char)),
                ('pOutPos', POINTER(c_char)),
                ('nBits', c_char),
                ('nBitBuffer', c_long),
                ('pDictPos', POINTER(c_char)),
                ('nDictSize', c_int),
                ('nCurDictSize', c_int),
                ('Dict', c_char * 0x1000)]        # fixed-size array: type * length
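As a rough sketch of how such a structure might be filled in before a call, here is a variant that uses the unsigned ctypes types to mirror the C declaration more literally, so the byte-sized members accept plain Python ints. The helper name make_stream and the 64-byte output capacity are my own assumptions for illustration and are not part of the library.

```python
import ctypes
from ctypes import POINTER, Structure, c_ubyte, c_uint, c_ulong


class PKSTREAM(Structure):
    # Same member order as struct pkstream, using unsigned ctypes types.
    _fields_ = [('pInBuffer', POINTER(c_ubyte)),
                ('nInSize', c_uint),
                ('pOutBuffer', POINTER(c_ubyte)),
                ('nOutSize', c_uint),
                ('nLitSize', c_ubyte),
                ('nDictSizeByte', c_ubyte),
                ('pInPos', POINTER(c_ubyte)),
                ('pOutPos', POINTER(c_ubyte)),
                ('nBits', c_ubyte),
                ('nBitBuffer', c_ulong),
                ('pDictPos', POINTER(c_ubyte)),
                ('nDictSize', c_uint),
                ('nCurDictSize', c_uint),
                ('Dict', c_ubyte * 0x1000)]


def make_stream(data, out_capacity):
    """Hypothetical helper: build a PKSTREAM pointing at mutable buffers."""
    # create_string_buffer gives writable memory, unlike a Python bytes object.
    in_buf = ctypes.create_string_buffer(data)
    out_buf = ctypes.create_string_buffer(out_capacity)
    stream = PKSTREAM()
    # cast() reinterprets the char arrays as unsigned char pointers.
    stream.pInBuffer = ctypes.cast(in_buf, POINTER(c_ubyte))
    stream.nInSize = len(data)
    stream.pOutBuffer = ctypes.cast(out_buf, POINTER(c_ubyte))
    stream.nOutSize = out_capacity
    # Return the buffers too, so the caller keeps them referenced and can
    # read the output from out_buf afterwards.
    return stream, in_buf, out_buf


stream, in_buf, out_buf = make_stream(b'hello', 64)
```

After a successful call, the result would presumably be read with something like out_buf.raw[:stream.nOutSize], given that the C comment says nOutSize holds the output size upon return.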
As for (5), I don't know how you should decorate your C functions - use whatever works.
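For what it's worth, ctypes ties the calling convention to the loader class: ctypes.CDLL assumes __cdecl, and ctypes.WinDLL (Windows only) assumes __stdcall. The sketch below shows one way the two prototypes might be declared. The helper names declare_prototypes and load_pk_functions, and the dll_path argument, are assumptions of mine; stream_type stands for whatever Structure subclass you use to mirror struct pkstream.

```python
import ctypes


def declare_prototypes(lib, stream_type):
    """Declare int f(struct pkstream *) for both functions on a loaded library."""
    funcs = []
    for name in ('pkimplode', 'pkexplode'):
        func = getattr(lib, name)
        # One argument: a pointer to the structure. Returns a C int.
        func.argtypes = [ctypes.POINTER(stream_type)]
        func.restype = ctypes.c_int
        funcs.append(func)
    return tuple(funcs)


def load_pk_functions(dll_path, stream_type, use_stdcall=False):
    """Hypothetical loader: dll_path is the location of your DLL."""
    # CDLL assumes __cdecl; WinDLL exists only on Windows and assumes __stdcall.
    loader = ctypes.WinDLL if use_stdcall else ctypes.CDLL
    return declare_prototypes(loader(dll_path), stream_type)
```

With the prototypes declared, a call would look like pkexplode(ctypes.byref(stream)), and a nonzero return value indicates failure according to the comment in the C header.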