How cmp assembly instruction sets flags (X86_64 GN

Posted 2019-05-06 04:11

Question:

Here is a simple C program:

void main()
{
       unsigned char number1 = 4;
       unsigned char number2 = 5;

       if (number1 < number2)
       {
               number1 = 0;
       }
}

Here we are comparing two numbers. In assembly this is done with cmp, which works by subtracting one operand from the other.

How does cmp subtract its operands? Does it subtract the 1st operand from the 2nd, or vice versa? Either way, it should go like this:

case # 1:

4 - 5 = (0000 0100 - 0000 0101) = (0000 0100 + 1111 1010 + 1) = (0000 0100 + 1111 1011)

= 1111 1111 = -1

The sign bit is 1, so SF should be 1.

There is no carry out, so CF should be 0.

case # 2:

5 - 4 = (0000 0101 - 0000 0100) = (0000 0101 + 1111 1011 + 1)

= (0000 0101 + 1111 1100) = 1 0000 0001

Here there is a carry out, so CF should be 1.

The result is positive, so SF should be 0.

Now I compile and run the program (Linux x86_64, gcc, gdb) and place a breakpoint right after the cmp instruction to inspect the register state.

Breakpoint hit after cmp:

Breakpoint 2, 0x0000000000400509 in main ()
(gdb) disassemble
Dump of assembler code for function main:
   0x00000000004004f6 <+0>:     push   %rbp
   0x00000000004004f7 <+1>:     mov    %rsp,%rbp
   0x00000000004004fa <+4>:     movb   $0x4,-0x2(%rbp)
   0x00000000004004fe <+8>:     movb   $0x5,-0x1(%rbp)
   0x0000000000400502 <+12>:    movzbl -0x2(%rbp),%eax
   0x0000000000400506 <+16>:    cmp    -0x1(%rbp),%al
=> 0x0000000000400509 <+19>:    jae    0x40050f <main+25>
   0x000000000040050b <+21>:    movb   $0x0,-0x2(%rbp)
   0x000000000040050f <+25>:    pop    %rbp
   0x0000000000400510 <+26>:    retq
End of assembler dump.

Register dump after cmp has been executed:

(gdb) info reg
rax            0x4  4
rbx            0x0  0
rcx            0x0  0
rdx            0x7fffffffe608   140737488348680
rsi            0x7fffffffe5f8   140737488348664
rdi            0x1  1
rbp            0x7fffffffe510   0x7fffffffe510
rsp            0x7fffffffe510   0x7fffffffe510
r8             0x7ffff7dd4dd0   140737351863760
r9             0x7ffff7de99d0   140737351948752
r10            0x833    2099
r11            0x7ffff7a2f950   140737348041040
r12            0x400400 4195328
r13            0x7fffffffe5f0   140737488348656
r14            0x0  0
r15            0x0  0
rip            0x400509 0x400509 <main+19>
eflags         0x297    [ CF PF AF SF IF ]
cs             0x33 51
ss             0x2b 43
ds             0x0  0
es             0x0  0
fs             0x0  0
gs             0x0  0
(gdb)

So after cmp has executed, both flags are set: CF=1 and SF=1.

The actual flags (CF=1, SF=1) match neither of the results calculated above:

case # 1 (CF=0, SF=1) or case # 2 (CF=1, SF=0)

What is happening here? How does cmp actually set the flags?

Answer 1:

Operation of CMP
CMP performs a subtraction but does not store the result.
For this reason the effect on the flags is exactly the same between:

cmp eax,ecx
sub eax,ecx

As per the documentation:

Operation
temp ← SRC1 − SignExtend(SRC2);
ModifyStatusFlags; (* Modify status flags in the same manner as the SUB instruction*)
Flags Affected
The CF, OF, SF, ZF, AF, and PF flags are set according to the result.

Effects on the flags
So the following flags are affected like so:

Assume result = op1 - op2

CF - 1 if unsigned op2 > unsigned op1 (a borrow was needed)
OF - 1 if the sign bits of op1 and op2 differ and the sign bit of result != sign bit of op1
SF - 1 if MSB (aka sign bit) of result = 1
ZF - 1 if result = 0 (i.e. op1 = op2)
AF - 1 if there is a borrow out of the low nibble (from bit 4 into bit 3)
PF - 1 if the least significant byte of result has an even number of 1 bits

I suggest you read up on the OF and CF here: http://teaching.idallen.com/dat2343/10f/notes/040_overflow.txt

Order of the operands
I see that you like pain and are using the braindead variant of x86 assembly called AT&T syntax.
This being the case, you need to take into account that the operands are written in reverse order:

CMP %EAX, %ECX  =>  result for the flags = ECX - EAX
CMP OP2, OP1    =>  flags = OP1 - OP2

Whereas Intel syntax is

CMP ECX, EAX    =>  result for the flags = ECX - EAX
CMP OP1, OP2    =>  flags = OP1 - OP2

You can instruct gdb to show you Intel syntax using: set disassembly-flavor intel



Answer 2:

I think I understand it now. This is how I think it goes (a borrow occurs, so the carry flag is set):

4 - 5

1st operand = 4 = 0000 0100
2nd operand = 5 = 0000 0101

So we have to perform

      1st operand
    - 2nd operand
    --------------


      7654 3210 <-- Bit number
      0000 0100
    - 0000 0101
    ------------

Let's start.

Bit 0 of 1st operand = 0
Bit 0 of 2nd operand = 1

so

  0
- 1 
 ===
  ?

To do this,

let's borrow a 1 from the left of bit 0 of the 1st operand.

The nearest 1 is bit 2 of the 1st operand.

Bit 2 set to 1 is worth 4.

We can write 4 as 2 + 2, i.e. as two 2s.

      7654 3210 <-- Bit number
             1
             1         
      0000 0000
    - 0000 0101
    ------------

In the step above, we have rewritten bit 2 of the 1st operand (worth 4) as two 2s (the two 1s stacked in the bit 1 column).

Likewise, a 2 can be written as two 1s. So we borrow one of the 1s from bit 1 of the 1st operand and write two 1s in the bit 0 column.

      7654 3210 <-- Bit number
              1
             11         
      0000 0000
    - 0000 0101
    ------------

Now we are ready to perform subtraction on bit 0 and bit 1.

      7654 3210 <-- Bit number
              1
             11         
      0000 0000
    - 0000 0101
    ------------
             11

With bit 0 and bit 1 solved, let's look at bit 2.

We run into the same problem again.

Bit 2 of 1st operand = 0

Bit 2 of 2nd operand = 1

To do this, let's borrow a 1 from the left of bit 2 of the 1st operand. Bits 3 to 7 are all 0, so the borrow has to come from beyond bit 7.

    8 7654 3210 <-- Bit number
              1
             11         
    1 0000 0000
    - 0000 0101
    ------------
             11

Now bit 8 of the 1st operand is 1: this is the 1 we borrowed.

An 8-bit operand has no bit 8, so this borrow came from outside the operand, and that is what sets the carry flag. So CF=1.

Now, if bit 8 is 1, it means 256.

256 = 128 + 128

if bit 7 is 1, it means 128. We can rewrite as

    8 7654 3210 <-- Bit number
      1       1
      1      11         
      0000 0000
    - 0000 0101
    ------------
             11

As previously, we can re-write it as:

    8 7654 3210 <-- Bit number
       1      1
      11     11         
      0000 0000
    - 0000 0101
    ------------
             11

As previously, we can re-write it as:

    8 7654 3210 <-- Bit number
        1     1
      111    11         
      0000 0000
    - 0000 0101
    ------------
             11

As previously, we can re-write it as:

    8 7654 3210 <-- Bit number
         1    1
      1111   11         
      0000 0000
    - 0000 0101
    ------------
             11

As previously, we can re-write it as:

    8 7654 3210 <-- Bit number
           1  1
      1111 1 11         
      0000 0000
    - 0000 0101
    ------------
             11

As previously, we can re-write it as:

    8 7654 3210 <-- Bit number
            1 1
      1111 1111         
      0000 0000
    - 0000 0101
    ------------
             11

At last we can solve this.

Subtracting the 2nd operand from everything stacked above it gives

    8 7654 3210 <-- Bit number
            1 1
      1111 1111         
      0000 0000
    - 0000 0101
    ------------
      1111 1111


So result = 1111 1111

Notice that the sign bit of the result (bit 7) is 1,

so the sign flag will be set, i.e. SF=1.

Therefore SF=1 and CF=1 after 4 - 5.