My (AT&T) assembly (x86-x64) code should increment

Posted 2019-07-15 09:03

Question:

I'm trying to write a small program in assembly (AT&T syntax). It should read an integer from the user, increment it, and then print the incremented value. However, the value doesn't increment. I've spent the last few hours trying everything I could come up with, but it still doesn't work, so I suspect I've misunderstood some concept in assembly and that's why I can't spot the mistake. This is my code:

  1 hiString: .asciz "Hi\n"
  2 formatstr: .asciz "%ld"
  3 
  4 .global main
  5 
  6 main:
  7     movq $0, %rax           #no vector registers printf
  8     movq $hiString, %rdi    #load hiString
  9     call printf             #printf
 10     call inout              #inout
 11     movq $0, %rdi           #loading exit value into register rdi
 12     call exit               #exit
 13 
 14 inout:
 15     pushq %rbp              #Pushing bp
 16     movq %rsp, %rbp         #Moving sp to bp
 17     subq $8, %rsp           #Space on stack for variable
 18     leaq -8(%rbp), %rsi
 19     movq $formatstr, %rdi   #1st argument scanf
 20     movq $0, %rax           #no vector for scanf registers
 21     call scanf              #scanf
 22     incq %rsi
 23     call printf

From a tutorial a friend gave me, I learned that lines 17 to 19 are necessary. However, I don't think I ever use the stack space I address there, so I suspect the error has something to do with that. I'm not sure, of course. Thank you in advance.

EDIT, UPDATED CODE (printf is still called in the subroutine now)

  1 hiString: .asciz "hi\n"
  2 formatstr: .asciz "%ld"
  3 
  4 .global main
  5 
  6 main:
  7     movq $0, %rax          
  8     movq $hiString, %rdi
  9     call printf             
 10     call inout              
 11     movq $0, %rdi           
 12     call exit               
 13 
 14 inout:
 15     pushq %rbp             
 16     movq %rsp, %rbp         
 17     subq $8, %rsp         
 18     leaq -8(%rbp), %rsi
 19     movq $formatstr, %rdi   
 20     movq $0, %rax           
 21     call scanf              
 22     popq %rax
 23     incq %rax
 24     movq %rax, %rsi
 25     movq $0, %rax
 26     call printf
 27     addq $8, %rs  

It runs and increments now. However, when the incremented value is output, some weird characters show up after the value.

Edit: Never mind, the above only happened once. Now no incremented value is output at all, only weird characters.

Answer 1:

This is an assembly-level version of the classic confusion about how to call scanf correctly.

 14 inout:
 15     pushq %rbp              #Pushing bp
 16     movq %rsp, %rbp         #Moving sp to bp
 17     subq $8, %rsp           #Space on stack for variable
 18     leaq -8(%rbp), %rsi
 19     movq $formatstr, %rdi   #1st argument scanf
 20     movq $0, %rax           #no vector for scanf registers
 21     call scanf              #scanf

Up to this point your code is correct (except that you haven't aligned the stack correctly, but don't worry about that right now, scanf will probably let you get away with it).

 22     incq %rsi

Here's where you go wrong. Before the call you set RSI (the second argument register for scanf) to be a pointer to a storage location. scanf read a number from stdin and wrote it to that storage location, not to RSI.

From the discussion in the comments, your intention is to add one to the value read by scanf and immediately print it back out. As several other people pointed out, after scanf returns, you cannot assume that the values you loaded into RSI, RDI, or RAX are intact. The x86-64 psABI specifies which registers are preserved across a function call: of the general-purpose integer registers, only RBX, RBP, RSP, and R12 through R15. You should read this document cover to cover if you intend to do much assembly programming on x86-64. (Caution: Windows uses a different ABI, which Microsoft documents separately as the x64 calling convention.) So you must set up the call to printf from scratch:

       movq -8(%rbp), %rsi   # load variable as arg 2 of printf
       incq %rsi             # and add one
       movq $formatstr, %rdi # first argument to printf
       xorl %eax, %eax       # no vector args to printf (also zeroes all of RAX)
       call printf

Pay close attention to the difference between scanf and printf here: you can use the same format string for both, but when you call scanf you pass the address of a storage location (leaq -8(%rbp), %rsi), whereas when you call printf you pass the value to be printed (movq -8(%rbp), %rsi; incq %rsi).

(In fact you ought to use a slightly different format string when you call printf, because you need to print a newline after the number, so "%ld\n" would be better.)
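The address-versus-value distinction can also be seen in a few lines of C. This is only a sketch: increment_text is a hypothetical helper that is not part of your program, and sscanf and snprintf stand in for scanf and printf so that it works on strings instead of the terminal.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration, not from the
   original program): parse a long from `in`, add one, and format the
   result into `out`. Returns 0 on success, -1 if no number could be
   parsed or if `out` is too small. */
static int increment_text(const char *in, char *out, size_t out_size)
{
    long x;

    /* Like scanf: pass the ADDRESS of x so the callee can store into it.
       In the assembly this is `leaq -8(%rbp), %rsi`. */
    if (sscanf(in, "%ld", &x) != 1)
        return -1;

    /* Like printf: pass the VALUE to print (here x + 1).
       In the assembly this is `movq -8(%rbp), %rsi; incq %rsi`. */
    int n = snprintf(out, out_size, "%ld\n", x + 1);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Called as increment_text("41", buf, sizeof buf), this should leave "42\n" in buf. Note that every argument is set up again for the second call; nothing is carried over from the first one.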

Your current code does almost this, in a different way. I do it this way because it's bad practice to mess with the stack pointer (popq %rax) in the middle of a function. (Remember what I said about not aligning the stack correctly? It's much easier to keep the stack aligned if you set up a complete "call frame" on entry and then leave the stack pointer alone until exit. Technically you are only required to have the stack pointer aligned at the point of each call instruction, though.)
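If the alignment arithmetic is hard to picture, here is a small sketch of it in C. rsp_mod16_at_call is a hypothetical helper of my own, and it assumes the System V x86-64 rule that RSP is a multiple of 16 immediately before each call instruction, so that on entry to the callee RSP mod 16 is 8 (the return address takes 8 bytes).

```c
/* Hypothetical helper (an assumption for illustration): given the number
   of 8-byte pushes and the number of bytes subtracted from RSP since
   function entry, return RSP modulo 16 at that point. A result of 0
   means the stack is correctly aligned for a call instruction. */
static unsigned rsp_mod16_at_call(unsigned pushes, unsigned sub_bytes)
{
    /* Bytes used below the last 16-byte-aligned position:
       return address + pushed registers + local variable space. */
    unsigned used = 8 + 8 * pushes + sub_bytes;

    /* The stack grows downward, so RSP sits `used` bytes below a
       multiple of 16. */
    return (16 - used % 16) % 16;
}
```

For your inout, `pushq %rbp` followed by `subq $8, %rsp` corresponds to rsp_mod16_at_call(1, 8), which gives 8, so the stack is misaligned when scanf is called. Reserving 16 bytes instead, rsp_mod16_at_call(1, 16), gives 0.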

You also don't end the function correctly:

 27     addq $8, %rs  

I think you didn't copy and paste your entire program - this looks like it's been cut off in the middle of the line. Regardless, if you're going to bother having a frame pointer in the first place (frame pointers are not required on x86-64) you should use it again to exit:

        movq %rbp, %rsp
        popq %rbp
        ret

Incidentally, "AT&T" assembly syntax is used for many different CPU architectures. When talking about assembly language we always need to know the CPU architecture first; the syntax variant (if any) is secondary. You should have titled the question "My assembly program (x86-64, AT&T syntax) ..."

As a final piece of advice, I would suggest you compile this C program

#include <stdio.h>

static void inout(void)
{
    long x;
    scanf("%ld", &x);
    printf("%ld\n", x+1);
}

int main(void)
{
    printf("hi\n");
    inout();
    return 0;
}

with your choice of C compiler, using options equivalent to -S -O2 -fno-inline (that is: generate textual assembly language, optimized, but don't do any inlining) and then read through the assembly output line by line. Whenever the C compiler does something different than you did, that probably means it knows something you don't know and you should learn about that something.



Answer 2:

re: updated code:

It runs and increments now. However, when the incremented value is output, some weird characters show up after the value.

Arg-passing registers are call-clobbered. You call printf without putting the format-string into %rdi, which you have to assume holds garbage after scanf returns.

Single-step your code with a debugger. Use ni to step over calls in gdb. (See the bottom of the x86 tag wiki for GDB tips).