I am playing around with setting up my own runtime environment for an executable, and I can't get clang (v3.4-1ubuntu1, target: x86_64-pc-linux-gnu) to produce an executable that doesn't segfault. I have reduced the problem to the following:
I have a file crt1.c that does nothing except satisfy the linker's requirement for a _start symbol:
void
_start(char *arguments, ...)
{
}
When I compile it with clang -nostdlib crt1.c, it produces the following executable (disassembly from objdump -d a.out):
a.out: file format elf64-x86-64
Disassembly of section .text:
0000000000400150 <_start>:
400150: 55 push %rbp
400151: 48 89 e5 mov %rsp,%rbp
400154: 48 81 ec f0 00 00 00 sub $0xf0,%rsp
40015b: 84 c0 test %al,%al
40015d: 0f 29 bd 30 ff ff ff movaps %xmm7,-0xd0(%rbp)
400164: 0f 29 b5 20 ff ff ff movaps %xmm6,-0xe0(%rbp)
40016b: 0f 29 ad 10 ff ff ff movaps %xmm5,-0xf0(%rbp)
400172: 0f 29 a5 00 ff ff ff movaps %xmm4,-0x100(%rbp)
400179: 0f 29 9d f0 fe ff ff movaps %xmm3,-0x110(%rbp)
400180: 0f 29 95 e0 fe ff ff movaps %xmm2,-0x120(%rbp)
400187: 0f 29 8d d0 fe ff ff movaps %xmm1,-0x130(%rbp)
40018e: 0f 29 85 c0 fe ff ff movaps %xmm0,-0x140(%rbp)
400195: 48 89 bd b8 fe ff ff mov %rdi,-0x148(%rbp)
40019c: 4c 89 8d b0 fe ff ff mov %r9,-0x150(%rbp)
4001a3: 4c 89 85 a8 fe ff ff mov %r8,-0x158(%rbp)
4001aa: 48 89 8d a0 fe ff ff mov %rcx,-0x160(%rbp)
4001b1: 48 89 95 98 fe ff ff mov %rdx,-0x168(%rbp)
4001b8: 48 89 b5 90 fe ff ff mov %rsi,-0x170(%rbp)
4001bf: 0f 84 5b 00 00 00 je 400220 <_start+0xd0>
4001c5: 0f 28 85 c0 fe ff ff movaps -0x140(%rbp),%xmm0
4001cc: 0f 29 85 70 ff ff ff movaps %xmm0,-0x90(%rbp)
4001d3: 0f 28 8d d0 fe ff ff movaps -0x130(%rbp),%xmm1
4001da: 0f 29 4d 80 movaps %xmm1,-0x80(%rbp)
4001de: 0f 28 95 e0 fe ff ff movaps -0x120(%rbp),%xmm2
4001e5: 0f 29 55 90 movaps %xmm2,-0x70(%rbp)
4001e9: 0f 28 9d f0 fe ff ff movaps -0x110(%rbp),%xmm3
4001f0: 0f 29 5d a0 movaps %xmm3,-0x60(%rbp)
4001f4: 0f 28 a5 00 ff ff ff movaps -0x100(%rbp),%xmm4
4001fb: 0f 29 65 b0 movaps %xmm4,-0x50(%rbp)
4001ff: 0f 28 ad 10 ff ff ff movaps -0xf0(%rbp),%xmm5
400206: 0f 29 6d c0 movaps %xmm5,-0x40(%rbp)
40020a: 0f 28 b5 20 ff ff ff movaps -0xe0(%rbp),%xmm6
400211: 0f 29 75 d0 movaps %xmm6,-0x30(%rbp)
400215: 0f 28 bd 30 ff ff ff movaps -0xd0(%rbp),%xmm7
40021c: 0f 29 7d e0 movaps %xmm7,-0x20(%rbp)
400220: 48 8b 85 b0 fe ff ff mov -0x150(%rbp),%rax
400227: 48 89 85 68 ff ff ff mov %rax,-0x98(%rbp)
40022e: 48 8b 8d a8 fe ff ff mov -0x158(%rbp),%rcx
400235: 48 89 8d 60 ff ff ff mov %rcx,-0xa0(%rbp)
40023c: 48 8b 95 a0 fe ff ff mov -0x160(%rbp),%rdx
400243: 48 89 95 58 ff ff ff mov %rdx,-0xa8(%rbp)
40024a: 48 8b b5 98 fe ff ff mov -0x168(%rbp),%rsi
400251: 48 89 b5 50 ff ff ff mov %rsi,-0xb0(%rbp)
400258: 48 8b bd 90 fe ff ff mov -0x170(%rbp),%rdi
40025f: 48 89 bd 48 ff ff ff mov %rdi,-0xb8(%rbp)
400266: 4c 8b 85 b8 fe ff ff mov -0x148(%rbp),%r8
40026d: 4c 89 45 f8 mov %r8,-0x8(%rbp)
400271: 48 81 c4 f0 00 00 00 add $0xf0,%rsp
400278: 5d pop %rbp
400279: c3 retq
The executable crashes with a segmentation fault at the instruction at address 40015d, the one that saves %xmm7 to the stack. I don't know why clang is saving these registers; gcc produces no such instructions.
The value in %rbp at that point is 0x7fffffffe588, which is not 16-byte aligned. Since movaps requires a 16-byte-aligned memory operand, I guess that explains the segmentation fault. But how do I get this to work? Can I get clang to suppress those save instructions, or get %rbp aligned somehow?
EDIT: I guess this problem comes down to the fact that the code clang produces assumes that %rsp is going to be 16-byte aligned. Is that a valid assumption to make? Why is it not true in this example?