Why does an assembly program only work when linked

Posted 2020-02-28 08:43

Question:

I like to know how programs work, so to keep things as bare-bones as possible I experiment with assembly.

I just found out how to assemble code for x86_64 that uses the wprintf function (and found out that wide chars are 32 bits). All I had to do was link against libc (-lc).

I'm trying to do about the same thing in 32-bit code, but I stumbled quite a bit. Eventually I used gcc to do the linking (and changed _start: to main:). Then I did the linking myself with ld and included crt1.o, crti.o and crtn.o. With those, my program worked (it wouldn't print anything before). So my question is: can I do something within my code to eliminate the need for these three object files (and, of course, revert to _start: instead of main:)?

test_lib.S

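.section .data
locale:
  .string ""                    # empty string: take the locale from the environment
  .align 4                      # align the wide-character format string below
printformat:
  .long '%','l','c',0           # L"%lc" as 32-bit wide characters, NUL-terminated

.section .text
.global main
main:

pushl   $locale                 # 2nd argument: locale string
pushl   $6                      # 1st argument: 6 is LC_ALL in glibc
call    setlocale
pushl   $12414                  # 2nd argument: 0x307E, the code point of ま
pushl   $printformat            # 1st argument: wide format string
call    wprintf
pushl   $2                      # exit status
call    exit                    # does not return, so the stack is never cleaned up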

I assemble, link and run it with the following:

as --32 test_lib.S -o test_lib.o
ld -m elf_i386 -L/lib/ -L/usr/lib/ -I/lib/ld-linux.so.2 /usr/lib/crt1.o /usr/lib/crti.o -lc /usr/lib/crtn.o test_lib.o -o test_lib
./test_lib

Oh, and the output is simply the Japanese hiragana ま (ma). Note that there is no line break, so it prints right before the prompt.

Answer 1:

Here is what those files do for you. They provide the C runtime environment and the setup code that connects your program to the OS.

  • crt1.o Newer style of the initial runtime code. Contains the _start symbol, which sets up the environment (argc, argv, and libc's _init and _fini) and then calls __libc_start_main in libc, which in turn calls your main. glibc calls this file 'start.S'.

  • crti.o Defines the function prologues for _init in the .init section and _fini in the .fini section. glibc calls this 'initfini.c'.

  • crtn.o Defines the matching function epilogues. glibc calls this 'initfini.c'.

There is an excellent write-up with example code for each of the object files above at http://wiki.osdev.org/Creating_a_C_Library.



Answer 2:

In 32-bit Linux, the command-line arguments are on the stack. At startup, 0(%esp) contains argc. At 4(%esp) you'll find a pointer to the program name (which is counted in argc). After that comes the rest of the array of pointers to the arguments, terminated by a NULL pointer. After that comes another NULL-terminated array of pointers to the environment variables. I am told that there may be over 200 of them!

Shiarta