Self-Modifying Code in C
Is it possible to write a C function that does the following?
- Allocates a chunk of memory on the heap
- Writes machine code into it
- Executes those machine instructions
Of course, I would have to restore the stack state to what it was before executing these machine instructions manually, but I want to know if this is possible in the first place.
This is certainly possible. For various reasons, we have spent a lot of effort over the past 30-40 years trying to make it as difficult as possible, but it is possible. Most systems now have hardware and software mechanisms that try to protect the data space from execution.
The basics, however, are pretty simple: you write a piece of code and assemble it, either by hand or through a compiler. Then you need those bytes available at run time, so you paste the assembled code into your program as data:
unsigned char prgm[] = { 0x0F, 0xAB, 0x9A ... }; // Random numbers, just as an example
Since you wanted to use the heap, you need to malloc some space and copy the code into it:
#include <stdlib.h> // malloc
#include <string.h> // memcpy
void *myspace;
if ((myspace = malloc(sizeof(prgm))) != NULL) {
    memcpy(myspace, prgm, sizeof(prgm));
} else {
    // allocation error
}
Now you need a way to make the program counter point to this piece of data, which is also your piece of code. Here's where you need a little trickery. Setting the program counter is not a big deal; it's just a JUMP instruction on the underlying machine. But how do you do that from C?
One of the easiest ways is to purposefully fiddle with the stack. The stack, again conceptually, looks something like this (the details depend on your OS, your compiler, and your hardware):
| subroutine return addr |
| parameters ...         |
| automatic variables    |
The main trick here is to sneak the address of your code into the return address slot; when the procedure returns, it basically jumps to that return address. If you can fake it, the PC will be set to wherever you like.
So you need a routine, call it "goThere()":
void goThere(void *addr) {
    int a;      // see the layout above; this is the first slot
                // on the stack next to the parameters
    void **pa;  // will point at the return address slot
    // Start from the address of 'a' and back up by the size of an int,
    // the pointer parameter on the stack, and the return address.
    // The offset and its direction are compiler and ABI specific, and
    // this is undefined behavior as far as the C standard is concerned.
    pa = (void **)((char *)&a - (sizeof(int) + 2 * sizeof(void *)));
    // Now 'pa' points to the routine's return address on the stack.
    *pa = addr; // sneak the address of the new code into the return addr
    return;     // and return, tricking it into "returning"
                // to the address of your special code block
}
Will this work? Maybe, depending on the hardware and OS. Most modern OSes protect the heap (by memory mapping or similar) from having the PC move into it. That is a good thing for security, because it keeps anyone from taking that kind of control easily.
Read "calling code stored in the heap from VC++". On POSIX, mprotect seems to be the appropriate call (see man mprotect):
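// mprotect needs a page-aligned address, which malloc does not guarantee,
// so allocate whole pages with posix_memalign (or mmap) instead
size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);
size_t len = ((sizeof(code) + pagesize - 1) / pagesize) * pagesize;
void *mem = NULL;
if (posix_memalign(&mem, pagesize, len) == 0) {
    memcpy(mem, code, sizeof(code));
    mprotect(mem, len, PROT_READ|PROT_WRITE|PROT_EXEC);
}
// now arrange some code to jump to mem. But read the notes here on casting
// from void* to a function pointer:
// http://www.opengroup.org/onlinepubs/009695399/functions/dlsym.html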
However, the man page says:
Whether PROT_EXEC has an effect other than PROT_READ is architecture and kernel specific. On some hardware architectures (eg i386) PROT_WRITE implies PROT_READ.
So it is best to check first whether this actually works on your operating system.
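For what it's worth, here is a minimal end-to-end sketch of that idea, under some loud assumptions: a POSIX system, GCC or Clang (for `__builtin___clear_cache`), and an x86-64 or AArch64 CPU. The payload bytes and the name `run_generated_code` are made up for illustration; the payload is just a tiny function that returns 42. It uses mmap rather than malloc so the memory is page-aligned, and it flips the page from writable to executable instead of asking for both at once, since some systems refuse writable and executable mappings.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical payload: a function that takes no arguments and returns 42.
   The bytes are architecture specific. */
#if defined(__x86_64__)
static const unsigned char code[] = {
    0xB8, 0x2A, 0x00, 0x00, 0x00,   /* mov eax, 42 */
    0xC3                            /* ret */
};
#elif defined(__aarch64__)
static const unsigned char code[] = {
    0x40, 0x05, 0x80, 0x52,         /* mov w0, #42 */
    0xC0, 0x03, 0x5F, 0xD6          /* ret */
};
#else
#error "no example payload for this architecture"
#endif

/* Returns the payload's result, or -1 if the memory could not be set up. */
int run_generated_code(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);   /* one whole page */

    /* mmap hands back page-aligned memory, which mprotect requires. */
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    memcpy(mem, code, sizeof(code));

    /* Switch the page from read/write to read/execute. */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return -1;
    }

    /* GCC/Clang builtin: make sure the instruction cache sees the new bytes
       (matters on AArch64, effectively a no-op on x86-64). */
    __builtin___clear_cache((char *)mem, (char *)mem + sizeof(code));

    /* Converting void* to a function pointer is not guaranteed by ISO C,
       but POSIX systems support it (see the dlsym notes linked above). */
    int (*fn)(void) = (int (*)(void))mem;
    int result = fn();

    munmap(mem, len);
    return result;
}
```

Calling `run_generated_code()` from `main` should give 42 where the OS allows this; a hardened system may make the mprotect call fail, in which case the function returns -1.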
RE: manual stack recovery
If the machine code you generate follows the calling conventions used by your platform / compiler, you don't need to restore the stack manually. The compiler will do it for you when you write
(*pfunc)(arg);
It adds any appropriate stack handling steps before or after the call as needed.
Just make sure you follow the correct conventions within the generated code.
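As a rough illustration of what "following the conventions" means, here is a sketch where the generated code takes one `int` argument and returns it plus one. Everything in it is an assumption for the example: the payload bytes, the names `add_one_code`, `pfunc_t` and `call_generated`, a POSIX system, GCC or Clang, and either the System V AMD64 ABI (first integer argument in `edi`, result in `eax`) or AAPCS64 (argument and result in `w0`). Because the payload reads its argument and leaves its result where the ABI says, a plain call through a function pointer is enough and the compiler handles the stack.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical payload, equivalent to: int add_one(int x) { return x + 1; } */
#if defined(__x86_64__) && !defined(_WIN32)
static const unsigned char add_one_code[] = {
    0x8D, 0x47, 0x01,               /* lea eax, [rdi + 1] */
    0xC3                            /* ret */
};
#elif defined(__aarch64__)
static const unsigned char add_one_code[] = {
    0x00, 0x04, 0x00, 0x11,         /* add w0, w0, #1 */
    0xC0, 0x03, 0x5F, 0xD6          /* ret */
};
#else
#error "no example payload for this architecture"
#endif

typedef int (*pfunc_t)(int);

/* Runs the payload on 'arg' and stores its result in *out.
   Returns 0 on success, -1 if executable memory could not be set up. */
int call_generated(int arg, int *out)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    memcpy(mem, add_one_code, sizeof(add_one_code));
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return -1;
    }
    __builtin___clear_cache((char *)mem, (char *)mem + sizeof(add_one_code));

    pfunc_t pfunc = (pfunc_t)mem;   /* not guaranteed by ISO C, fine on POSIX */
    *out = (*pfunc)(arg);           /* the compiler emits the call sequence */

    munmap(mem, len);
    return 0;
}
```

With this, `call_generated(41, &r)` should leave 42 in `r`. If the payload clobbered callee-saved registers or left the stack pointer changed, the same call would misbehave, which is exactly why the generated code has to stick to the convention.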