Differences between x86 and x86-64 machine code

Posted 2019-08-11 02:32

So, I've got a program which generates JIT x86 machine code and executes it directly and I want it to support x86-64/AMD64/x64 as well. The obvious differences are:

  • New registers (rax, r8...) and pointer width (pointers need to use 64bit regs)
  • Default C calling convention (arguments on stack vs. registers)
  • Some new mnemonics (pushq to push 64bit)

Are there any differences in the binary instructions as well or should it be (roughly) sufficient to use pushq and 64bit registers when appropriate and the code will just work?

Code example:

static inline void emit_call(uint32_t target) {
    emit_byte(0xE8);
    emit_dword(target - ((uint32_t)out + 4));
}

This would still work if I use uintptr_t instead of uint32_t I assume, but loading an immediate into a 64bit register rax is different from loading it into the lower 32bit alias:

static void emit_mov_x86reg_immediate(int x86reg, int imm) {
    emit_byte(0xB8 | x86reg);
    emit_dword(imm);
}

Are there any other differences?

The code I'm working on is accessible here, if you want to take a look at it.

1 answer
淡お忘
#2 · 2019-08-11 03:22

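There's actually no difference between the old 32bit push and the new 64bit push: push is one of the few instructions whose operand size defaults to 64 bits in 64bit mode, so the encoding stays the same (and a 32bit push can no longer be encoded).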

Relative branches and calls still use 32bit offsets, so a target more than ±2 GiB away from the call site needs an indirect call or jump.
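To illustrate the rel32 point, here is a minimal sketch of how the emit_call from the question might look in 64bit mode. It assumes a 64bit host; the buffer code_buf, the helper emit_qword, and the choice of r11 as scratch register (caller-saved in both the System V and Windows x64 conventions) are my assumptions, not taken from the original code.

```c
#include <stdint.h>

/* Hypothetical output buffer; `out` is assumed to be a write cursor
   like the one in the question's code. */
static uint8_t code_buf[64];
static uint8_t *out = code_buf;

static void emit_byte(uint8_t b) { *out++ = b; }

static void emit_dword(uint32_t v) {
    for (int i = 0; i < 4; i++) emit_byte((uint8_t)(v >> (8 * i)));
}

static void emit_qword(uint64_t v) {
    for (int i = 0; i < 8; i++) emit_byte((uint8_t)(v >> (8 * i)));
}

static void emit_call(uint64_t target) {
    /* rel32 is relative to the end of the 5-byte E8 instruction. */
    uint64_t next = (uint64_t)(uintptr_t)out + 5;
    int64_t diff = (int64_t)(target - next);

    if (diff >= INT32_MIN && diff <= INT32_MAX) {
        emit_byte(0xE8);                      /* call rel32 */
        emit_dword((uint32_t)(int32_t)diff);
    } else {
        emit_byte(0x49); emit_byte(0xBB);     /* mov r11, imm64 */
        emit_qword(target);
        emit_byte(0x41); emit_byte(0xFF);     /* call r11 (FF /2) */
        emit_byte(0xD3);
    }
}
```

The far path clobbers r11, so it only works if the generated code does not keep a live value there. If the code is later copied to a different address, the rel32 form has to be recomputed for the final location.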

Some actual differences are:

  • the REX prefix, obviously, for extra registers (also remember spl, bpl, sil and dil: a REX prefix with none of the bits set can still matter, because without one the same register numbers mean ah, ch, dh and bh)
  • the REX prefix again: its byte values (40 to 4F) used to be the short encodings of inc (40+rd) and dec (48+rd), so inc and dec must use the FF /0 and FF /1 encodings instead.
  • rip-relative addressing
  • 64bit load immediate (mov r64, imm64) and mov with a direct 64bit address (the moffs forms, accumulator only)
  • sign-extend 32bit to 64bit (movsxd) shares opcode with arpl
  • les and lds don't exist, reused as VEX prefixes (in 32bit mode, only les and lds with illegal operands are VEX prefixes, which is why the encoding of VEX prefixes is a bit odd)
  • several old instructions that no one was using, or that are useless in 64bit mode, were removed (decimal math, bound, into, pushad, pop es and friends)
  • the 82 /? aliases of 80 /? are no longer valid
  • lahf and sahf don't exist on some old x64 processors (not that you'd use them anyway..)
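
As a rough sketch of the REX, load-immediate and inc points from the list above, the emitters could look something like this. The names emit_rex, emit_mov_reg64_imm64, emit_mov_reg32_imm32 and emit_inc_reg64, as well as the buffer code_buf, are hypothetical; registers are assumed to be numbered 0 to 15 (rax = 0, r8 = 8).

```c
#include <stdint.h>

/* Hypothetical output buffer and write cursor. */
static uint8_t code_buf[64];
static uint8_t *out = code_buf;

static void emit_byte(uint8_t b) { *out++ = b; }

/* REX prefix layout: 0100WRXB. W selects 64bit operand size;
   R, X and B extend the ModRM.reg, SIB.index and ModRM.rm/opcode
   register fields to four bits. */
static void emit_rex(int w, int r, int x, int b) {
    emit_byte((uint8_t)(0x40 | (w << 3) | (r << 2) | (x << 1) | b));
}

/* mov r64, imm64: REX.W + B8+rd followed by 8 immediate bytes. */
static void emit_mov_reg64_imm64(int reg, uint64_t imm) {
    emit_rex(1, 0, 0, reg >> 3);
    emit_byte((uint8_t)(0xB8 | (reg & 7)));
    for (int i = 0; i < 8; i++) emit_byte((uint8_t)(imm >> (8 * i)));
}

/* mov r32, imm32: same encoding as in 32bit mode; writing a 32bit
   register zero-extends into the full 64bit register. A REX prefix
   is only needed to reach r8d..r15d. */
static void emit_mov_reg32_imm32(int reg, uint32_t imm) {
    if (reg >= 8) emit_rex(0, 0, 0, 1);
    emit_byte((uint8_t)(0xB8 | (reg & 7)));
    for (int i = 0; i < 4; i++) emit_byte((uint8_t)(imm >> (8 * i)));
}

/* inc r64: REX.W + FF /0, since 40+rd is now a REX prefix. */
static void emit_inc_reg64(int reg) {
    emit_rex(1, 0, 0, reg >> 3);
    emit_byte(0xFF);
    emit_byte((uint8_t)(0xC0 | (reg & 7)));  /* ModRM: mod=11, reg=/0, rm=reg */
}
```

For example, emit_mov_reg64_imm64(0, 0x1122334455667788) should produce 48 B8 88 77 66 55 44 33 22 11 (mov rax, imm64), and emit_inc_reg64(10) should produce 49 FF C2 (inc r10). When the constant fits in 32 bits, the shorter zero-extending 32bit form is usually the better choice.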