x86 CMP Instruction Difference

Posted 2019-02-16 10:17

Question:

Question

What is the (non-trivial) difference between the following two x86 instructions?

39 /r    CMP r/m32,r32   Compare r32 with r/m32
3B /r    CMP r32,r/m32   Compare r/m32 with r32

Background

I'm building a Java assembler, which will be used by my compiler's intermediate language to produce Windows-32 executables.

Currently I have the following code:

final ModelBase mb = new ModelBase(); // create new memory model
mb.addCode(new Compare(Register.ECX, Register.EAX)); // add code
mb.addCode(new Compare(Register.EAX, Register.ECX)); // add code

final FileOutputStream fos = new FileOutputStream(new File("test.exe"));
mb.writeToFile(fos);
fos.close();

This outputs a valid executable file that contains two CMP instructions in a TEXT section. The executable written to "test.exe" does nothing interesting, but that's not the point. The class Compare is a wrapper around the CMP instruction.

The above code produces (inspecting with OllyDbg):

Address   Hex dump                 Command
0040101F  |.  3BC8                 CMP ECX,EAX
00401021  |.  3BC1                 CMP EAX,ECX

The difference is subtle. If I use the 39 opcode byte instead, I get:

Address   Hex dump                 Command
0040101F  |.  39C1                 CMP ECX,EAX
00401021  |.  39C8                 CMP EAX,ECX

This makes me wonder whether the two opcodes are synonymous and why both exist.

Answer 1:

It doesn't matter which opcode you use when comparing two registers. The choice only matters when comparing a register with a memory operand, because the opcode determines which operand is subtracted from which: 39 /r computes r/m32 - r32, and 3B /r computes r32 - r/m32.

As for why this exists: the x86 instruction format uses the ModR/M byte to denote either a memory address or a register. Each instruction can have only one ModR/M byte, which means it can access only one memory address (not including special instructions like MOVSB). This means there can't be a general cmp r/m32, r/m32 instruction, so we need two different opcodes: cmp r/m32, r32 and cmp r32, r/m32. As a side effect, this creates some redundancy when comparing two registers.
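To make the redundancy concrete, here is a minimal Java sketch that builds the register-to-register ModR/M byte for both opcodes and reproduces the four byte sequences from the question. The class and method names (CmpEncodingSketch, modRM, encodeWith39, encodeWith3B) are made up for illustration and are not part of the asker's assembler; only register-direct operands (mod = 11) are handled.

```java
public class CmpEncodingSketch {
    // Register numbers as they appear in the ModR/M byte (only two are needed here).
    static final int EAX = 0;
    static final int ECX = 1;

    // Register-direct ModR/M byte: mod = 11, then the 3-bit reg field, then the 3-bit r/m field.
    static int modRM(int reg, int rm) {
        return 0xC0 | (reg << 3) | rm;
    }

    // Opcode 39 /r (CMP r/m32, r32): the first operand goes in r/m, the second in reg.
    static String encodeWith39(int first, int second) {
        return String.format("39%02X", modRM(second, first));
    }

    // Opcode 3B /r (CMP r32, r/m32): the first operand goes in reg, the second in r/m.
    static String encodeWith3B(int first, int second) {
        return String.format("3B%02X", modRM(first, second));
    }

    public static void main(String[] args) {
        System.out.println("CMP ECX,EAX via 3B: " + encodeWith3B(ECX, EAX));
        System.out.println("CMP EAX,ECX via 3B: " + encodeWith3B(EAX, ECX));
        System.out.println("CMP ECX,EAX via 39: " + encodeWith39(ECX, EAX));
        System.out.println("CMP EAX,ECX via 39: " + encodeWith39(EAX, ECX));
    }
}
```

Both opcodes can express the same register comparison; they simply swap which ModR/M field holds which register, so the second byte differs (C8 versus C1).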



Answer 2:

This is a redundancy in the x86 encoding, and there are many more cases like it. A compiler or assembler is free to use any of the valid opcodes.

Some assemblers let you choose which opcode to emit. For example, in GAS you can append ".s" to the mnemonic to use the other instruction encoding:

10 de   adcb   %bl,%dh
12 f3   adcb.s %bl,%dh


Answer 3:

CMP ECX,EAX computes ECX-EAX and CMP EAX,ECX computes EAX-ECX. The flags are set differently depending on which operand is compared to which. Of course, you could probably get away with only one of them if it weren't for the ModR/M structure of x86 instructions.
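As a rough illustration of why operand order matters, the following Java sketch models the main status flags that CMP derives from a 32-bit subtraction a - b. The class and method names are hypothetical, and only ZF, SF, CF and OF are modeled (PF and AF are left out).

```java
public class CmpFlagsSketch {
    // Describe the flags CMP would set for the 32-bit subtraction a - b.
    static String flags(int a, int b) {
        int result = a - b;                              // wraps like a 32-bit register
        boolean zf = result == 0;                        // zero flag: operands are equal
        boolean sf = result < 0;                         // sign flag: top bit of the result
        boolean cf = Integer.compareUnsigned(a, b) < 0;  // carry flag: unsigned borrow
        boolean of = ((a ^ b) & (a ^ result)) < 0;       // overflow flag: signed overflow
        return "ZF=" + (zf ? 1 : 0) + " SF=" + (sf ? 1 : 0)
             + " CF=" + (cf ? 1 : 0) + " OF=" + (of ? 1 : 0);
    }

    public static void main(String[] args) {
        int ecx = 1, eax = 2;
        System.out.println("CMP ECX,EAX: " + flags(ecx, eax)); // ECX - EAX
        System.out.println("CMP EAX,ECX: " + flags(eax, ecx)); // EAX - ECX
    }
}
```

With ECX = 1 and EAX = 2, the first comparison borrows (CF and SF set) and the second does not, so a following conditional jump such as JB or JL would behave differently.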