Following up on "Why is the ELF execution entry point virtual address of the form 0x80xxxxx and not zero 0x0?" and "Why do virtual memory addresses for linux binaries start at 0x8048000?": why cannot I make ld use a different entry point than the default with ld -e?
If I do so, I get a segmentation fault with return code 139, even for addresses close to the default entry point. Why?
EDIT:
I will make the question more specific:
.text
.globl _start
_start:
movl $0x4,%eax # eax = code for 'write' system call
movl $1,%ebx # ebx = file descriptor to standard output
movl $message,%ecx # ecx = pointer to the message
movl $13,%edx # edx = length of the message
int $0x80 # make the system call
movl $0x0,%ebx # the status returned by 'exit'
movl $0x1,%eax # eax = code for 'exit' system call
int $0x80 # make the system call
.data
.globl message
message:
.string "Hello world\n" # The message as data
If I assemble this with as program.s -o program.o and then link it statically with ld -N program.o -o program, readelf -l program shows 0x0000000000400078 as the VirtAddr of the text segment and 0x400078 as the entry point. When run, "Hello world" is printed.
However, when I try to link with ld -N -e0x400082 -Ttext=0x400082 program.o -o program (moving text segment and entry point by 4 bytes), the program is killed. Inspecting it with readelf -l now shows two different program headers of type LOAD, one at 0x0000000000400082 and one at 0x00000000004000b0.
When I try 0x400086, it all works, and there is only one LOAD segment.
- What's going on here?
- Which memory addresses may I choose, which ones can't I choose, and why?
Thank you.
why cannot I make ld use a different entry point than the default with ld -e
You sure can. This:
int foo(int argc, char *argv[]) { return 0; }
gcc main.c -Wl,-e,foo
wouldn't work, because execution doesn't start at main. It starts at _start, which is linked in from crt1.o (part of glibc; the same role is played by crt0.o on some other systems) and arranges for things like dynamic linking, etc. to start up properly. By redirecting the entry point from _start to foo, you've bypassed all that required glibc initialization, and so things don't work.
But if you don't need dynamic linking, and are willing to do what glibc normally does for you, then you can name the entry point whatever you want. Example:
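#include <unistd.h>       /* declares syscall() */
#include <sys/syscall.h>  /* SYS_write, SYS_exit */
void foo(void)
{
    syscall(SYS_write, 1, "Hello, world\n", 13);
    syscall(SYS_exit, 0);  /* never returns, so foo needs no return value */
}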
gcc t.c -static -nostartfiles -Wl,-e,foo && ./a.out
Hello, world
Oh, and the title of this question doesn't match your actual question (bad idea(TM)).
To answer the question in the title: you sure can change the address your executable is linked at. By default, you get a load address of 0x8048000 (only for 32-bit; the 64-bit default is 0x400000).
You can easily change that to e.g. 0x80000 by adding -Wl,-Ttext-segment=0x80000 to the link line.
Update:
However, when I try to link with ld -N -e0x400082 -Ttext=0x400082 program.o -o program (moving text segment and entry point by 4 bytes), the program will be killed.
Well, it is impossible to set -Ttext to 0x400082 without violating the .text section's alignment constraint (which is 4). You must keep the .text address aligned on at least a 4-byte boundary (or change the required alignment of .text).
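To make the alignment arithmetic concrete, here is a minimal sketch in C. The helper names is_aligned and align_up are mine, not anything from binutils, and the code only shows the arithmetic; it does not claim to reproduce what ld does internally with an unaligned -Ttext.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if addr is a multiple of align.
 * align must be a non-zero power of two (4 for .text here). */
bool is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Hypothetical helper: round addr up to the next multiple of align. */
uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

With these, is_aligned(0x400082, 4) is false because 0x400082 leaves a remainder of 2 when divided by 4, and align_up(0x400082, 4) gives 0x400084, the next address that satisfies the constraint.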
When I set the start address to 0x400078, 0x40007c, 0x400080, 0x400084, ..., 0x400098 and use GNU-ld 2.20.1, the program works.
However, when I use current CVS snapshot of binutils, the program works for 0x400078, 0x40007c, 0x400088, 0x40008c, and gets Killed for 0x400080, 0x400084, 0x400090, 0x400094, 0x400098. This might be a bug in the linker, or I am violating some other constraint (I don't see which though).
At this point, if you are really interested, I suggest downloading the binutils sources, building ld, and figuring out what exactly causes it to create two PT_LOAD segments instead of one.
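If you want to check many link variants quickly without reading readelf -l output by hand, something like the following sketch may help. It reads the entry point and counts the PT_LOAD program headers. The function name count_load_segments is made up for this example, and the code assumes a 64-bit ELF file with the same byte order as the host, on a Linux system that provides <elf.h>.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints each PT_LOAD program header of a 64-bit ELF
 * file, stores the entry point in *entry, and returns the PT_LOAD count.
 * Returns -1 on I/O error or if the file is not ELF64. */
int count_load_segments(const char *path, uint64_t *entry)
{
    assert(path != NULL && entry != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    Elf64_Ehdr eh;
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64) {
        fclose(f);
        return -1;
    }
    *entry = eh.e_entry;

    int loads = 0;
    for (unsigned i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        /* Program headers are e_phentsize bytes apart, starting at e_phoff. */
        long off = (long)(eh.e_phoff + (uint64_t)i * eh.e_phentsize);
        if (fseek(f, off, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1) {
            fclose(f);
            return -1;
        }
        if (ph.p_type == PT_LOAD) {
            printf("LOAD vaddr=0x%llx filesz=0x%llx align=0x%llx\n",
                   (unsigned long long)ph.p_vaddr,
                   (unsigned long long)ph.p_filesz,
                   (unsigned long long)ph.p_align);
            loads++;
        }
    }
    fclose(f);
    return loads;
}
```

Calling count_load_segments("program", &entry) on each linked variant then tells you whether that particular -Ttext value produced one PT_LOAD segment or two.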
Update 2:
Force new segment for sections with overlapping LMAs.
Ah! That just means you need to move .data out of the way. This makes a working executable:
ld -N -o t t.o -e0x400080 -Ttext=0x400080 -Tdata=0x400180
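As a sanity check on the addresses in that command, here is a small sketch that tests whether two address ranges overlap. The helper name ranges_overlap is mine. The section sizes used in the note below (34 bytes for .text, 13 bytes for .data) are my own estimate from the listing in the question: six 5-byte movl instructions plus two 2-byte int instructions, and a 12-character string plus its terminating NUL. Verify them with readelf -S before relying on them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if the half-open ranges [a, a+alen) and
 * [b, b+blen) share at least one address. Lengths must be non-zero. */
bool ranges_overlap(uint64_t a, uint64_t alen, uint64_t b, uint64_t blen)
{
    assert(alen > 0 && blen > 0);
    return a < b + blen && b < a + alen;
}
```

Under those assumed sizes, ranges_overlap(0x400080, 34, 0x400180, 13) is false, so .text at 0x400080 and .data at 0x400180 stay clear of each other. This only checks the chosen addresses; it does not explain why the linker considered the default placement to be overlapping.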