I'm compiling a 32 bit binary but want to embed some 64 bit assembly in it.
void method() {
    asm("...64 bit assembly...");
}
Of course when I compile I get errors about referring to bad registers because the registers are 64 bit.
evil.c:92: Error: bad register name `%rax'
Is it possible to add some annotations so gcc will process the asm sections using the 64-bit assembler instead? I have a workaround, which is to compile the code separately, map in a page with PROT_EXEC|PROT_WRITE, and copy my code into it, but this is very awkward.
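For reference, the workaround looks roughly like the sketch below. The helper names `run_copied_code` and `demo_run_mov42` are made up for illustration, the page size is assumed to be 4096 bytes, and the copied bytes are a mode-neutral `mov $42, %eax; ret` rather than real 64-bit-only code, so this only shows the map-and-copy mechanics on x86 Linux.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#define PAGE_BYTES 4096          /* assumption: 4 KiB pages */

/* Hypothetical helper: copy `len` bytes of machine code into a fresh
 * writable + executable page, call it as int(*)(void), then unmap it.
 * Returns -1 if the code does not fit or the mapping fails. */
int run_copied_code(const unsigned char *code, size_t len)
{
    if (len > PAGE_BYTES)
        return -1;

    void *page = mmap(NULL, PAGE_BYTES,
                      PROT_READ | PROT_WRITE | PROT_EXEC,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    memcpy(page, code, len);

    /* Go through uintptr_t to turn the data pointer into a function pointer. */
    int (*fn)(void) = (int (*)(void))(uintptr_t)page;
    int result = fn();

    munmap(page, PAGE_BYTES);
    return result;
}

/* mov $42, %eax ; ret  (same encoding in 32-bit and 64-bit mode) */
static const unsigned char mov42_ret[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };

int demo_run_mov42(void)
{
    return run_copied_code(mov42_ret, sizeof mov42_ret);
}
```

Calling `demo_run_mov42()` executes the copied bytes once. A hardened system that refuses writable and executable mappings would make `mmap` fail here, in which case the helper returns -1.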
No, this isn't possible. You can't run 64-bit assembly from a 32-bit binary, as the processor will not be in long mode while running your program.
Copying 64-bit code to an executable page will result in that code being interpreted incorrectly as 32-bit code, which will have unpredictable and undesirable results.
Switching between long mode and compatibility mode is done by changing CS. User mode code cannot modify the descriptor table, but it can perform a far jump or far call to a code segment that is already present in the descriptor table. In Linux the required descriptor is present (in my experience; this may not be true for all installations).
Here is sample code for 64-bit Linux (Ubuntu) that starts in 32-bit mode, switches to 64-bit mode, runs a function, and then switches back to 32-bit mode. Build with gcc -m32.
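As a much smaller, separate illustration of the point about CS (this is not the mode-switching sample itself), the sketch below reads the current code segment selector and compares it against the selectors an x86-64 Linux kernel uses for user code. The values `0x23` (32-bit compatibility mode) and `0x33` (64-bit mode) are assumptions about the stock Linux GDT layout, and the function names are made up for illustration.

```c
#include <stdint.h>

/* Assumed user-mode code segment selectors on an x86-64 Linux kernel. */
#define LINUX_USER32_CS 0x23   /* 32-bit compatibility-mode code segment */
#define LINUX_USER64_CS 0x33   /* 64-bit (long mode) code segment */

/* Read the current CS selector. Reading CS is allowed in user mode. */
uint16_t read_cs(void)
{
    uint16_t cs;
    __asm__ volatile("mov %%cs, %0" : "=r"(cs));
    return cs;
}

/* Nonzero if `cs` is the selector assumed above for 64-bit user code. */
int is_linux_user64_cs(uint16_t cs)
{
    return cs == LINUX_USER64_CS;
}

/* The low two bits of a selector are the requested privilege level;
 * user-mode code runs with RPL 3. */
int selector_rpl(uint16_t cs)
{
    return cs & 3;
}
```

Built with `gcc -m32` on a 64-bit kernel, `read_cs()` would be expected to return the 32-bit selector; a far jump to the 64-bit selector is what changes it.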
Don't try to put 64-bit machine-code inside a compiler-generated function. It might work since the encoding for function prologue/epilogue is the same in 32 and 64-bit, but it would be cleaner to just have a separate block of 64-bit code.
The easiest thing is probably to assemble that block in a separate file, using GAS `.code64` or NASM `BITS 64`, to get 64-bit code in an object file you can link into a 32-bit executable.

You said in a comment you're thinking of using this for a kernel exploit against a 64-bit kernel from a 32-bit user-space process, so you just need some code bytes in an executable part of your process's memory and a way to get a pointer to that block. This is certainly plausible; if you can gain control of the kernel's RIP from a 32-bit process, this is what you want, because kernel code will always be running in long mode.
If you were doing something with 64-bit userspace code in a process that started in 32-bit mode, you could maybe `far jmp` to the block of 64-bit code (as @RossRidge suggests), using a known value for the kernel's `__USER_CS` 64-bit code segment descriptor. `syscall` from 64-bit code should return in 64-bit mode, but if not, try the `int 0x80` ABI. It always returns to the mode you were in, saving/restoring `cs` and `ss` along with `rip` and `rflags`. (What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?)

`.rodata` is part of the text segment of your executable, so just get the compiler to put bytes in a `const` array. Fun fact: `const int main = 195;` compiles to a program that exits without segfaulting, because `195` = `0xc3` = the x86 encoding for `ret` (and x86 is little-endian). For an arbitrary-length machine-code sequence, `const char funcname[] = { 0x90, 0x90, ..., 0xc3 }` will work. The `const` is necessary, otherwise it will go in `.data` (read/write/noexec) instead of `.rodata`.
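A small sketch of the byte-array idea and of why `195` works as a `ret`. The names `ret_as_int`, `code_bytes`, and the accessor functions are made up for illustration. The code only inspects the bytes and never jumps to them, since whether `.rodata` ends up in an executable mapping depends on how the linker lays out segments.

```c
#include <stddef.h>
#include <string.h>

/* 195 == 0xc3, the one-byte x86 `ret`. On a little-endian machine the
 * 0xc3 byte is stored first, so it is the first "instruction". */
static const int ret_as_int = 195;

/* nop ; nop ; ret  as raw machine code in a const array */
const unsigned char code_bytes[] = { 0x90, 0x90, 0xc3 };

/* Return the lowest-addressed byte of ret_as_int. */
unsigned char first_byte_of_ret_int(void)
{
    unsigned char b;
    memcpy(&b, &ret_as_int, 1);
    return b;
}

/* Pointer to, and length of, the block of code bytes. */
const unsigned char *code_bytes_ptr(void) { return code_bytes; }
size_t code_bytes_len(void) { return sizeof code_bytes; }
```

On x86, `first_byte_of_ret_int()` yields `0xc3`, which is why jumping to the address of that `int` behaves like calling an empty function.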
You could use `const char funcname[] __attribute__((section(".text"))) = { ... };` to control what section it goes in (e.g. `.text` along with compiler-generated functions), or even a linker script to get more control.

If you really want to do it all in one `.c` file, instead of using the easier solution of a separately-assembled pure asm source:

To assemble some 64-bit code along with compiler-generated 32-bit code, use the `.code64` GAS directive in an `asm` statement *outside of any functions*. IDK if there's any guarantee on what section will be active when gcc emits your asm, or how gcc will mix that asm with its asm, but it won't put it in the middle of a function.

This compiles and assembles with gcc and clang (compiler explorer).
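A minimal sketch of that approach, not the original listing: the symbol names `code64_start` and `code64_end` and the helper `code64_size` are made up for illustration, and the `.pushsection`/`.popsection` pair is an extra choice made here to sidestep the question of which section is active. It assumes an x86 ELF target and a GAS-compatible assembler.

```c
#include <stddef.h>
#include <stdint.h>

/* Top-level asm: a block of 64-bit code bracketed by two labels.
 * .code64 makes the assembler encode what follows as 64-bit code;
 * .code32 switches back only when the rest of the file is 32-bit. */
__asm__(
    ".pushsection .text\n"
    ".code64\n"
    ".globl code64_start\n"
    "code64_start:\n"
    "    inc %r10d\n"
    "    ret\n"
    ".globl code64_end\n"
    "code64_end:\n"
#ifdef __i386__
    ".code32\n"
#endif
    ".popsection\n"
);

/* The labels above, seen from C as the bounds of a byte array. */
extern const unsigned char code64_start[];
extern const unsigned char code64_end[];

/* Size of the assembled block in bytes. */
size_t code64_size(void)
{
    return (size_t)((uintptr_t)code64_end - (uintptr_t)code64_start);
}
```

The block assembles to `41 ff c2` for `inc %r10d` followed by `c3` for `ret`, and C code can get a pointer to it through `code64_start`.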
I tried it on my desktop to double check:
This is the correct encoding for `inc %r10d` :)

The program also works when compiled without `-m32`, because I used `#ifdef` to decide whether to use `.code32` at the end or not. (There's no push/pop mode directive like there is for sections.)

Of course, disassembling the binary will show you:
because the disassembler doesn't know to switch to 64-bit disassembly for that block. (I wonder if ELF has attributes for that... I didn't use any assembler directives or linker scripts to generate such attributes, if such a thing exists.)