Executing data as C code

Using this answer (and this sequel) as inspiration, I have been looking at ways to execute some functional C programs (for which there are already many interesting discussions on this site). I would like to know how and when one can use the approach taken in the linked code: cast a string to a function pointer and execute it. For example, on my machine (OSX 10.10, Darwin 14.0.0, GCC 4.8.3) I can compile and run

int eax = ((int(*)())("\xc3 <- This returns the value of the EAX register"))();

      

(always returns 0, which is what I expect if the program does nothing), but

#include <stdio.h>

int main() {
  const char* lol = "\x8b\x5c\x24\x4\x3d\xe8\x3\x0\x0\x7e\x2\x31\xc0\x83\xf8\x64\x7d\x6\x40\x53\xff\xd3\x5b\xc3\xc3 <- Recursively calls the function at address lol.";
  int i = ((int(*)())(lol))(lol);
  printf("i: %d\n",i);
  return 0;
}

      

segfaults. On the other hand, the linked code successfully runs the second example, giving the correct answer i: 100.

When can you execute from strings? And is there a way to make this (relatively) consistent?

(I can reasonably assume that this behavior is undefined, and I know that I will be increasing unemployment globally using it.)

+3




2 answers


Strictly speaking, this is undefined behavior, and in practice it is implementation-specific.

Several things have to hold for this to work.

  • first, the machine code inside your string literal must be correct. Obviously that is specific to the processor and the ABI. But I trust you on that.
  • then, you depend on the calling convention used to call through the function pointer, that is, on the ABI specification.
  • finally, on many processors (notably x86-64) the machine code must sit in some executable segment. I guess this is usually not the case (but it may be operating-system specific). Read more about the NX bit and ASLR (and also PIC). Sometimes this can be worked around, for example by mmap-ing a segment with execute permission and copying the machine code there.


By the way, you might be interested in JIT techniques and libraries (libjit, lightning, asmjit, LLVM...).

As DCoder commented, read more about shellcode and, more generally, code injection.

A more portable approach might be (as I did in MELT) to generate some C (or C++) code on the fly, run a compilation of that code into a shared object, and then dlopen that shared object (and dlsym the appropriate symbols).
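The generate, compile, dlopen route described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name compile_and_lookup is hypothetical (it is not taken from MELT), a C compiler is assumed to be reachable as cc, and the path stem is assumed to be trusted and to contain a slash so that dlopen treats it as a path.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from MELT): writes `source` to <stem>.c,
 * compiles it into <stem>.so with the system compiler (assumed to be `cc`),
 * loads the shared object and returns the address of `symbol`.
 * Returns NULL on any failure.
 * Assumption: `stem` is trusted input and contains a '/', e.g. "/tmp/gen". */
void *compile_and_lookup(const char *source, const char *stem, const char *symbol)
{
    char c_path[256], so_path[256], cmd[1024];

    assert(source != NULL && stem != NULL && symbol != NULL);
    snprintf(c_path, sizeof c_path, "%s.c", stem);
    snprintf(so_path, sizeof so_path, "%s.so", stem);

    /* 1. Emit the generated C code. */
    FILE *f = fopen(c_path, "w");
    if (f == NULL)
        return NULL;
    fputs(source, f);
    if (fclose(f) != 0)
        return NULL;

    /* 2. Compile it into a position-independent shared object. */
    snprintf(cmd, sizeof cmd, "cc -shared -fPIC -o %s %s", so_path, c_path);
    if (system(cmd) != 0)
        return NULL;

    /* 3. Load the object and resolve the symbol. The handle is deliberately
     *    never dlclose()d, so the returned code stays mapped. */
    void *handle = dlopen(so_path, RTLD_NOW);
    if (handle == NULL)
        return NULL;
    return dlsym(handle, symbol);
}
```

A caller would convert the returned pointer to the right function type, for example int (*add)(int, int) = (int (*)(int, int))compile_and_lookup(src, "/tmp/gen_add", "add"). That conversion is guaranteed by POSIX rather than by ISO C, and on older glibc versions the program may need to be linked with -ldl.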

+4




Generally speaking, the contents of string literals on Linux and OSX are stored in a read-only segment, which is also executable (this may not necessarily be the case on Windows or other platforms). This is why you can do things like

((void (*)(void))L"\xfeeb")();

      

on x86 and x86_64 Linux and OSX and not get a compiler error. However, if the machine instructions you put into the string literal do not follow the conventions for how functions must be structured on your operating system and hardware platform, you are likely to run into a segfault. An executable string literal that works on Linux AArch64 may not work on OSX on x86_64, and vice versa.



If you want to learn how to programmatically generate executable machine code, you can (in POSIX) allocate a region of executable memory with the mmap() function, put your code there, and experiment to your heart's content.
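A minimal sketch of that mmap() experiment might look like the following. The names run_machine_code and X86_RETURN_42 are made up for illustration, and the sample bytes assume an x86 or x86-64 processor (they encode mov eax, 42 followed by ret). The mapping is writable first and only then switched to executable with mprotect(), since some systems refuse pages that are writable and executable at once. Other platforms may impose further requirements.

```c
#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS on glibc under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Assumption: x86 / x86-64 only. Bytes for: mov eax, 42 ; ret */
static const unsigned char X86_RETURN_42[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };

/* Hypothetical helper: copies `len` bytes of machine code into a fresh
 * anonymous mapping, makes the mapping executable, calls it as a function
 * of type int(void) and stores its result in *out.
 * Returns 0 on success, -1 if the memory could not be set up. */
int run_machine_code(const unsigned char *code, size_t len, int *out)
{
    assert(code != NULL && out != NULL);

    /* Writable (not yet executable) scratch memory. */
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    memcpy(mem, code, len);

    /* Drop write permission and add execute permission. */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return -1;
    }

    /* Object-to-function pointer conversion: fine on POSIX systems,
     * but not guaranteed by ISO C. */
    int (*fn)(void) = (int (*)(void))mem;
    *out = fn();

    munmap(mem, len);
    return 0;
}
```

On an x86-64 machine, calling run_machine_code(X86_RETURN_42, sizeof X86_RETURN_42, &result) should leave 42 in result. The code being run still has to respect the platform's calling convention, exactly as with the string-literal trick.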

At some point, you may find disassemble <addr>,+<range> helpful in gdb, and disassemble --start-address <addr> --end-address <addr> helpful in lldb.

0








