I am wondering if it is legal to return with ret
from a program's entry point.
Example with NASM:
section .text
global _start
_start:
ret
; Linux: nasm -f elf64 foo.asm -o foo.o && ld foo.o
; OS X: nasm -f macho64 foo.asm -o foo.o && ld foo.o -lc -macosx_version_min 10.12.0 -e _start -o foo
ret pops a return address from the stack and jumps to it.
But are the top bytes of the stack a valid return address at the program entry point, or do I have to call exit?
Also, the program above does not segfault on OS X. Where does it return to?
Supplementing what Michael Petch already answered: from the perspective of a runnable Mach-O executable, program launch happens either through the load command LC_MAIN (used by most modern executables since 10.7), which involves DYLD in the process, or through the backward-compatible load command LC_UNIXTHREAD. The former is the variant where your ret is allowed, and in fact preferable, because you return control to DYLD __mh_execute_header. This will be followed by a buffer flush. As an alternative to ret you can make the exit system call, either through the undocumented syscall kernel API (64-bit; int 0x80 for 32-bit) or through the DYLD wrapper C library that does it for you (documented). If your executable does not use LC_MAIN, you are left with the legacy LC_UNIXTHREAD, where you have no alternative to the exit system call; ret will cause a segmentation fault.
MacOS Dynamic Executables
When you are using MacOS and link with the command from the question:
ld foo.o -lc -macosx_version_min 10.12.0 -e _start -o foo
you are getting a dynamically loaded version of your code.
_start isn't the true entry point, the dynamic loader is. As one of its last steps, the dynamic loader does C/C++/Objective-C runtime initialization and then calls the entry point you specified with the -e option. The Apple documentation about Forking and Executing the Process has these paragraphs:
In your case that is _start. In this environment, where you are creating a dynamically linked executable, you can do a ret and have it return to the code that called _start, which does an exit system call for you. This is why it doesn't crash. If you review the generated object file with gobjdump -Dx foo you should get:
Notice that start address is 0, and the code at 0 is dyld_stub_binder. This is the dynamic loader stub that eventually sets up a C runtime environment and then calls your entry point _start. If you don't override the entry point it defaults to main.
MacOS Static Executables
If however you build as a static executable, there is no code executed before your entry point, and ret should crash since there is no valid return address on the stack. In the documentation quoted above is this:
A statically built executable doesn't use the dynamic loader dyld with crt1.o embedded in it. CRT = C runtime library, which covers C++/Objective-C as well on MacOS. The dynamic loading steps are not performed, the C/C++/Objective-C initialization code is not executed, and control is transferred directly to your entry point.
To build statically, drop the -lc (or -lSystem) from the linker command and add the -static option:
If you run this version it should produce a segmentation fault.
gobjdump -Dx foo produces
You should notice start address is now 0x1fff. 0x1fff is the entry point you specified (_start). There is no dynamic loader stub as an intermediary.
Linux
Under Linux, when you specify your own entry point and return from it with ret, the program will segfault whether you are building a static or a shared executable. There is good information on how ELF executables are run on Linux in this article and the dynamic linker documentation. The key point to observe is that the Linux documentation makes no mention of doing C/C++/Objective-C runtime initialisation, unlike the MacOS dynamic linker documentation.
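Since nothing on Linux sits between the loader and your entry point, the entry point has to end with an exit system call rather than ret. A minimal sketch, assuming x86-64 Linux and the same build commands as in the question (the exit status 0 is an arbitrary choice for illustration):

```nasm
; Linux x86-64 sketch: leave _start through the exit system call, not ret.
; Build as in the question: nasm -f elf64 foo.asm -o foo.o && ld foo.o
section .text
global _start
_start:
    mov eax, 60         ; system call number for exit on x86-64 Linux
    xor edi, edi        ; first argument: exit status 0
    syscall             ; enters the kernel and does not return
```

This is only the Linux variant; the system call number and the build commands differ on MacOS.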
The key difference between the Linux dynamic loader (ld.so) and the MacOS one (dyld) is that the MacOS dynamic loader performs C/C++/Objective-C startup initialization by including the entry point from crt1.o. The code in crt1.o then transfers control to the entry point you specified with -e (the default is main). In Linux the dynamic loader makes no assumption about the type of code that will be run. After the shared objects are processed and initialized, control is transferred directly to the entry point.
Stack Layout at Process Creation
FreeBSD (on which MacOS is based) and Linux share one thing in common. When loading 64-bit executables the layout of the user stack when a process is created is the same. The stack for 32-bit processes is similar but pointers and data are 4 bytes wide, not 8.
Although there isn't a return address on the stack, there is other data representing the number of arguments, the arguments, environment variables, and other information. This layout is not the same as what the main function in C/C++ expects. It is part of the C startup code to convert the stack at process creation to something compatible with the C calling convention and the expectations of the function main (argc, argv, envp).
I wrote more information on this subject in this Stackoverflow answer that shows how a statically linked MacOS executable can traverse through the program arguments passed by the kernel at process creation.
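As a rough illustration of the layout described in this section, the sketch below reads the argument count directly from the top of the stack at the entry point and uses it as the exit status. It assumes x86-64 Linux and the build commands from the question; it is not taken from the linked answer, which covers the MacOS case.

```nasm
; Linux x86-64 sketch: at _start, [rsp] holds argc, not a return address.
; Build as in the question: nasm -f elf64 foo.asm -o foo.o && ld foo.o
section .text
global _start
_start:
    mov rdi, [rsp]      ; argc, the first 8-byte item on the initial stack
    lea rsi, [rsp + 8]  ; address of argv[0]; the argv pointers follow argc (unused here)
    mov eax, 60         ; system call number for exit on x86-64 Linux
    syscall             ; exit status = argc (only the low 8 bits are reported)
```

Running the result as ./a.out one two and then echo $? should print 3, since the program name counts as an argument. It also shows why ret faults here: it would pop argc and try to jump to that small number as if it were an address.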