I'm writing a JIT compiler in C for x86_64 Linux.
Currently the idea is to generate some bytecode in a buffer of executable memory (e.g. obtained with an mmap call) and jump to it with a function pointer.
I'd like to be able to link multiple blocks of executable memory together such that they can jump between each other using only native instructions.
Ideally, the C-level pointer to an executable block can be written into another block as an absolute jump address, something like this:
unsigned char code_1[] = { 0xAB, 0xCD, ... };
void *exec_block_1 = mmap(NULL, ... );
write_bytecode(code_1, exec_block_1);
...
unsigned char code_2[] = { 0xAB, 0xCD, ... , exec_block_1, ... };
void *exec_block_2 = mmap(NULL, ... );
write_bytecode(code_2, exec_block_2); // bytecode contains exec_block_1 as a jump
// address so that the code in the second block
// can jump to the code in the first block
However, I'm finding the limitations of x86_64 quite an obstacle here. As far as I can tell, there's no way to jump to an absolute 64-bit address given as an immediate in x86_64, as all the direct jump encodings are relative to the instruction pointer. This means that I can't use the C pointer as a jump target for generated code.
Is there a solution to this problem that will allow me to link blocks together in the manner I've described? Perhaps an x86_64 instruction that I'm not aware of?
Hmm, I'm not sure I understood your question correctly, or whether this is a proper answer. It's quite a convoluted way to achieve this.
`e8 00 00 00 00` is just there to get the current instruction pointer onto the top of the stack. The code then adjusts `rax` so that it points at the `landing` label. You'll need to replace the `XX` (in `mov rax, code_block`) with the virtual address of the code block. The `ret` instruction is used as a call. When the called block returns, execution should continue at `landing`.
Is this the kind of thing you're trying to achieve?
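
One possible reading of the description above, written as a C emitter rather than assembly. This is only a sketch and not necessarily the exact sequence the answer had in mind: the helper name `emit_call_via_ret`, the use of `pop rax` / `add rax, imm8` to compute the landing address, and the byte layout are all assumptions. It also assumes a little-endian host, which holds on x86_64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: emits an absolute "call" to `target` built from
 * push + ret. Writes 23 bytes into buf and returns that count.
 * The emitted code clobbers rax. */
static size_t emit_call_via_ret(unsigned char *buf, uint64_t target)
{
    static const unsigned char call_next[] = { 0xE8, 0x00, 0x00, 0x00, 0x00 };
    size_t n = 0;

    assert(buf != NULL);

    /* call +0: pushes the address of the next instruction (start + 5) */
    memcpy(buf + n, call_next, sizeof call_next);
    n += sizeof call_next;

    buf[n++] = 0x58;                 /* pop rax: rax = start + 5          */

    buf[n++] = 0x48;                 /* add rax, 18: rax = start + 23,    */
    buf[n++] = 0x83;                 /* i.e. the byte right after the ret */
    buf[n++] = 0xC0;                 /* ("landing")                       */
    buf[n++] = 18;

    buf[n++] = 0x50;                 /* push rax: return address          */

    buf[n++] = 0x48;                 /* mov rax, imm64 (movabs)           */
    buf[n++] = 0xB8;
    memcpy(buf + n, &target, 8);     /* little-endian 64-bit immediate    */
    n += 8;

    buf[n++] = 0x50;                 /* push rax: jump target             */
    buf[n++] = 0xC3;                 /* ret: pops the target into rip     */

    return n;                        /* landing code starts at buf + n    */
}
```

When the target block eventually executes its own `ret`, it pops the landing address pushed earlier, so control resumes right after this sequence. A plain `call rax` would do the same job in fewer bytes; the sketch only illustrates the push/`ret` idea.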
If you know the addresses of the blocks at the time when you are emitting the jump instructions, you can just check to see if the distance in bytes from the address of the jump instruction to the address of the target block fits within the 32-bit signed offset of the `jXX` family of instructions.
Even if you `mmap` each block separately, chances are pretty good that you won't get two neighbouring (in the control-flow sense) blocks that are more than ±2GiB apart. That being said, there are several good reasons not to map each block separately like that. First of all, `mmap`'s minimum unit of allocation is (almost by definition) a page, which is probably at least 4KiB. That means that the unused space after the code for each block is wasted. Secondly, packing the basic blocks more tightly increases the utilization of the instruction cache and the chances of a shorter jump encoding being valid.
Incidentally, there is an instruction for loading a 64-bit immediate into `rax`. The GNU toolchain refers to it as `movabs`.
So if you really want to, you can simply load the pointer into `rax` and use a jump to register.
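
Both options can be combined in the emitter. The following is a minimal sketch, assuming a hypothetical helper `emit_jump` that is told the address `at` where the emitted bytes will eventually execute. It handles only the unconditional `jmp`; the conditional `jXX rel32` forms are 6 bytes long, so the displacement would be measured from `at + 6` instead. It assumes a little-endian host and a compiler with two's-complement conversion (true for GCC and Clang on x86_64).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: emits a jump to `target` into buf, where `at` is
 * the run-time address of buf[0]. Returns the number of bytes written
 * (5 or 12); buf must have room for 12 bytes. */
static size_t emit_jump(unsigned char *buf, uint64_t at, uint64_t target)
{
    int64_t disp;

    assert(buf != NULL);

    /* rel32 is relative to the end of the 5-byte jmp instruction */
    disp = (int64_t)(target - (at + 5));

    if (disp >= INT32_MIN && disp <= INT32_MAX) {
        int32_t rel = (int32_t)disp;
        buf[0] = 0xE9;                   /* jmp rel32 */
        memcpy(buf + 1, &rel, 4);
        return 5;
    }

    /* Out of range: load the absolute address and jump through rax.
     * This clobbers rax, so the code generator must treat it as scratch. */
    buf[0] = 0x48;                       /* mov rax, imm64 (movabs) */
    buf[1] = 0xB8;
    memcpy(buf + 2, &target, 8);
    buf[10] = 0xFF;                      /* jmp rax */
    buf[11] = 0xE0;
    return 12;
}
```

Because the two encodings differ in length, a real code generator either reserves 12 bytes for every inter-block jump or decides the form before laying out the following instructions.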