JIT transitions (x86_64)
I am writing a JIT compiler in C for x86_64 Linux.
Currently the idea is to generate some machine code in an executable memory buffer (such as one obtained by calling mmap) and jump to it through a function pointer.
I would like to be able to link multiple blocks of executable memory together so that they can jump among themselves using only their own instructions.
Ideally, a C-level pointer to an executable block could be written to another block as an absolute branch address, something like this:
unsigned char code_1[] = { 0xAB, 0xCD, ... };
void *exec_block_1 = mmap(NULL, ... );
write_bytecode(code_1, exec_block_1);
...
unsigned char code_2[] = { 0xAB, 0xCD, ... , exec_block_1, ... };
void *exec_block_2 = mmap(NULL, ... );
write_bytecode(code_2, exec_block_2); // bytecode contains exec_block_1 as a jump
// address so that the code in the second block
// can jump to the code in the first block
However, I found that the x86_64 limitations are a hindrance here. It does not seem possible to jump to an absolute 64-bit address in x86_64, because all the jump instructions I have found take an offset relative to the instruction pointer. This means that I cannot use a C pointer as the jump target for the generated code.
Is there a solution to this problem that will allow me to link the blocks together the way I described? Perhaps an x86_64 instruction that I am not aware of?
Hmm, I'm not sure I understood your question clearly, or whether this is the right answer, but here is a rather convoluted way to achieve this:
;instr ; opcodes [op size] (comment)
call next ; e8 00 00 00 00 [5] (call to get current location)
next:
pop rax ; 58 [1] (next label address in rax)
add rax, 12h ; 48 83 c0 12 [4] (adjust rax to fall on landing label)
push rax ; 50 [1] (push adjusted value)
mov rax, code_block ; 48 b8 XX XX XX XX XX XX XX XX [10] (load target address)
push rax ; 50 [1] (push to ret to code_block)
ret ; c3 [1] (go to code_block)
landing:
nop
nop
The e8 00 00 00 00 (call next) is there to get the current instruction pointer onto the top of the stack. The code then adjusts rax so that it points at the landing label, and pushes it as the return address. You need to replace XX (in mov rax, code_block) with the virtual address of code_block. The ret instruction is then used as a call. When the callee returns, execution should resume at landing.
Is this what you are trying to achieve?
If you know the block addresses when the branch instructions are emitted, you can simply check whether the distance in bytes from the branch instruction to the target block fits in the signed 32-bit relative offset used by the jXX family of instructions.
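As a rough sketch of that check (the helper names fits_rel32 and emit_jmp_rel32 are made up for illustration, and it assumes the code is written directly at the address it will run from), the displacement is measured from the end of the 5-byte jmp rel32 instruction:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if a 5-byte `jmp rel32` placed at `from`
   can reach `to`. The displacement is relative to the address of the
   next instruction, i.e. from + 5. */
static int fits_rel32(const unsigned char *from, const unsigned char *to)
{
    int64_t disp = (int64_t)((intptr_t)to - ((intptr_t)from + 5));
    return disp >= INT32_MIN && disp <= INT32_MAX;
}

/* Hypothetical helper: writes `jmp rel32` (opcode E9) at `at`.
   Returns the number of bytes written, or 0 if the target is out of range. */
static size_t emit_jmp_rel32(unsigned char *at, const unsigned char *target)
{
    if (!fits_rel32(at, target))
        return 0;
    int32_t disp = (int32_t)((intptr_t)target - ((intptr_t)at + 5));
    at[0] = 0xE9;
    memcpy(at + 1, &disp, sizeof disp); /* x86_64 is little-endian */
    return 5;
}
```

If emit_jmp_rel32 returns 0, the target is too far away and you would have to fall back to an indirect jump through a register.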
Even if you mmap each block separately, the chances are pretty good that you won't end up with two adjacent (in terms of control flow) blocks that are more than ±2 GiB apart. That being said, there are several good reasons not to map each block separately. First off, mmap's smallest allocation unit (almost by definition) is a page, which is probably at least 4 KiB. This means that the unused space after the code for each block will be wasted. Second, packing the basic blocks more tightly improves instruction cache utilization and makes it more likely that a shorter branch encoding is valid.
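One way to do that, sketched here under my own assumptions (struct code_arena and its functions are invented names, not an existing API), is to mmap a single region up front and carve blocks out of it with a bump allocator. The region is mapped writable first and switched to executable once the code has been emitted:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* for MAP_ANONYMOUS under strict -std=c11 */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical arena: one mapping shared by all code blocks. */
struct code_arena {
    unsigned char *base;
    size_t size;
    size_t used;
};

/* Maps `size` bytes of anonymous read/write memory. Returns 0 on success. */
static int arena_init(struct code_arena *a, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    a->base = p;
    a->size = size;
    a->used = 0;
    return 0;
}

/* Hands out `n` bytes starting at a 16-byte aligned offset, or NULL if full. */
static unsigned char *arena_alloc(struct code_arena *a, size_t n)
{
    size_t start = (a->used + 15) & ~(size_t)15;
    if (start > a->size || n > a->size - start)
        return NULL;
    a->used = start + n;
    return a->base + start;
}

/* Makes the whole arena read/execute once code has been written. */
static int arena_seal(struct code_arena *a)
{
    return mprotect(a->base, a->size, PROT_READ | PROT_EXEC);
}
```

As long as the arena is smaller than 2 GiB, every block in it is within rel32 range of every other block.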
Perhaps an x86_64 instruction that I am not aware of?
By the way, there is an instruction for loading a 64-bit immediate into rax. The GNU toolchain refers to it as movabs:
0000000000000000 <.text>:
0: 48 b8 ff ff ff ff ff movabs rax,0x7fffffffffffffff
7: ff ff 7f
So, if you really want to, you can just load the pointer into rax and use a jump to register.
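For what it's worth, emitting that pair from C could look roughly like this (emit_jmp_abs64 is a name I made up; the encodings are 48 B8 imm64 for movabs rax and FF E0 for jmp rax). Note that it clobbers rax, so the generated code must not expect rax to survive the jump:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes
       movabs rax, target   ; 48 B8 <8-byte immediate>
       jmp    rax           ; FF E0
   at `at` and returns the number of bytes written (12). */
static size_t emit_jmp_abs64(unsigned char *at, const void *target)
{
    uint64_t addr = (uint64_t)(uintptr_t)target;

    at[0] = 0x48;                       /* REX.W prefix */
    at[1] = 0xB8;                       /* mov rax, imm64 */
    memcpy(at + 2, &addr, sizeof addr); /* x86_64 is little-endian */
    at[10] = 0xFF;                      /* jmp r/m64 ... */
    at[11] = 0xE0;                      /* ... with rax as the operand */
    return 12;
}
```

The C pointer to the first block can be passed as target when generating the second block, which is the kind of linking the question asks about.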