I am building a 32-bit OS in assembly.
I have set up the IDT and I am handling program interrupts through the int instruction.
How can I enable the syscall and sysenter instructions, and how do I handle them and return from them?
Is it true that the syscall instruction isn't supported in 32-bit mode by Intel processors, so I can't use it?
Is it true that the sysret instruction isn't safe?
Is there a tutorial for that somewhere?
EDIT: My main question is how to enable the syscall and sysenter instructions! (Not a duplicate.)
syscall cannot be used on x86, only on x86_64 (portably at least). That being said, on x86_64 the instructions are enabled by setting the SCE bit (bit 0) in the IA32_EFER model-specific register, loading the correct CS selectors for user mode and kernel mode into the IA32_STAR MSR, and then loading the address of whatever you want to call when syscall is executed into IA32_LSTAR. You also need to handle the execution context of these instructions carefully, as they clobber some registers (syscall overwrites RCX and R11, for example).
I suggest reading up in the manuals - both the Intel manual itself and Volume 2 of the AMD64 manual are good places to start.
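As a rough illustration of the MSR setup described above, here is a minimal C sketch that only computes the values a 64-bit kernel would write. The MSR numbers and bit positions are the architectural ones; the GDT layout, the helper names (`make_star`, `make_msr_write`, `plan_syscall_setup`) and the `struct msr_write` type are assumptions made up for this example. The actual `wrmsr` instructions are privileged, so they are left out and only described in comments.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural MSR numbers, as listed in the Intel and AMD manuals. */
#define MSR_EFER   0xC0000080u  /* bit 0 (SCE) turns syscall/sysret on  */
#define MSR_STAR   0xC0000081u  /* segment selectors for syscall/sysret */
#define MSR_LSTAR  0xC0000082u  /* 64-bit syscall entry point (new RIP) */
#define MSR_FMASK  0xC0000084u  /* RFLAGS bits cleared on syscall       */

#define EFER_SCE   (1ull << 0)
#define RFLAGS_IF  (1ull << 9)

/* One pending MSR write, kept as data so this sketch runs in user space.
 * A kernel would execute wrmsr with ECX = msr, EAX = lo, EDX = hi. */
struct msr_write {
    uint32_t msr;
    uint32_t lo;
    uint32_t hi;
};

/* STAR[47:32]: syscall loads CS from this selector and SS from selector + 8.
 * STAR[63:48]: sysret to 64-bit code loads CS from selector + 16 and
 *              SS from selector + 8 (CS = selector for 32-bit code). */
static uint64_t make_star(uint16_t kernel_cs, uint16_t sysret_base)
{
    return ((uint64_t)sysret_base << 48) | ((uint64_t)kernel_cs << 32);
}

static struct msr_write make_msr_write(uint32_t msr, uint64_t value)
{
    struct msr_write w = { msr, (uint32_t)value, (uint32_t)(value >> 32) };
    return w;
}

/* Fills out[0..2] with the STAR, LSTAR and FMASK writes.
 * EFER is not included: the kernel has to rdmsr it, OR in EFER_SCE,
 * and write it back so the other EFER bits are preserved. */
static void plan_syscall_setup(struct msr_write out[3], uint16_t kernel_cs,
                               uint16_t sysret_base, uint64_t entry_point)
{
    assert(out != 0);
    out[0] = make_msr_write(MSR_STAR, make_star(kernel_cs, sysret_base));
    out[1] = make_msr_write(MSR_LSTAR, entry_point);
    /* Mask IF so the handler starts with interrupts disabled. */
    out[2] = make_msr_write(MSR_FMASK, RFLAGS_IF);
}
```

With a hypothetical GDT of 0x08 kernel code, 0x10 kernel data, 0x18 user 32-bit code, 0x20 user data and 0x28 user 64-bit code, the call would be `plan_syscall_setup(w, 0x08, 0x18 | 3, entry)`. The ordering of the GDT entries matters because the CPU derives the other selectors from the two stored in STAR by fixed offsets.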
See the OSdev wiki for details on sysenter, including a note about how to avoid a security/safety problem. Also see the Intel / AMD manuals for that. They go into a lot of the detail that OS developers need. See the x86 tag wiki for links.
Overview of the various system-call instructions:
int: available since forever (8086).
call gates (far call). See the OSdev link for details on that and traps.
sysenter: (http://wiki.osdev.org/Sysenter) Introduced by Intel before x86-64 existed, adopted by AMD not long after (many years ago). Available on all modern x86 CPUs. Very minimalist design, requires user-space cooperation for the kernel to be able to return, because it doesn't save EIP, ESP, or EFLAGS anywhere.
Linux supports it in 32 and 64-bit kernels for system calls from 32-bit processes only. IDK if you could design a kernel that used it for 64-bit system calls as well / instead. (I know that wasn't the question, but it's related.)
Using sysenter requires user-space cooperation to provide the return address and save its own ESP and EFLAGS. In Linux, the kernel exports a page of code which has the user-space side of this dance. User-space is expected to call this code instead of using sysenter directly, but feel free to design your OS however you want. Looking at Linux's code for both sides of this dance will probably be useful, if you don't find an example somewhere else.
syscall from 64-bit user-space: available everywhere because Intel implemented it along with the rest of AMD64. Well-designed interface that masks RFLAGS (with a configurable mask) before entering the kernel, so you can avoid the race window you'd have if you had to disable interrupts manually with cli. Used with swapgs for the kernel to get access to its stack and so on.
On mainstream x86 OSes (like Linux), syscall is the only way to make 64-bit system calls.
syscall from 32-bit user-space: A totally different instruction from long mode syscall, only available on AMD CPUs. The kernel-side interface is different for 32-bit kernels (legacy mode) vs. 64-bit kernels running 32-bit user-space (compat mode).
The Linux kernel has some useful comments on it:
Maybe a toy OS could use it without worrying about whatever problems make it unsuitable for Linux, IDK. But unless you're just plain curious, don't waste your time with it. OTOH, if you're interested in OS & CPU design, finding out what's wrong with the ISA design might be interesting.
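For the sysenter entry in the overview above, enabling it comes down to checking the SEP feature bit and writing three MSRs. The C sketch below shows the architectural MSR numbers and how the CPU derives the four segment selectors from the single value in IA32_SYSENTER_CS. The helper names and the `struct sysenter_selectors` type are hypothetical, and the register convention mentioned in the comments is just one possible choice, not what Linux does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural MSR numbers for sysenter/sysexit. */
#define MSR_SYSENTER_CS   0x174u  /* kernel CS; the other selectors follow it */
#define MSR_SYSENTER_ESP  0x175u  /* kernel stack pointer loaded on sysenter  */
#define MSR_SYSENTER_EIP  0x176u  /* kernel entry point loaded on sysenter    */

/* CPUID leaf 1, EDX bit 11 (SEP): sysenter/sysexit are present. */
#define CPUID1_EDX_SEP    (1u << 11)

struct sysenter_selectors {
    uint16_t kernel_cs;  /* loaded by sysenter */
    uint16_t kernel_ss;  /* loaded by sysenter */
    uint16_t user_cs;    /* loaded by sysexit  */
    uint16_t user_ss;    /* loaded by sysexit  */
};

static bool cpu_has_sep(uint32_t cpuid1_edx)
{
    return (cpuid1_edx & CPUID1_EDX_SEP) != 0;
}

/* The GDT must hold four consecutive descriptors in this order:
 * kernel code, kernel data, user code, user data. */
static struct sysenter_selectors sysenter_layout(uint16_t sysenter_cs)
{
    uint16_t base = (uint16_t)(sysenter_cs & 0xFFFCu);  /* drop RPL bits */
    struct sysenter_selectors s;
    s.kernel_cs = base;
    s.kernel_ss = (uint16_t)(base + 8);
    s.user_cs   = (uint16_t)((base + 16) | 3);
    s.user_ss   = (uint16_t)((base + 24) | 3);
    return s;
}

/* In the kernel (not shown, privileged):
 *   wrmsr(MSR_SYSENTER_CS,  kernel code selector)
 *   wrmsr(MSR_SYSENTER_ESP, top of the kernel stack)
 *   wrmsr(MSR_SYSENTER_EIP, address of the entry handler)
 * sysenter saves nothing, so user space must pass its return EIP and ESP
 * in registers. One possible convention is EDX = return EIP and
 * ECX = user ESP, because sysexit loads EIP from EDX and ESP from ECX.
 * sysenter also clears IF, and sysexit does not restore EFLAGS, so the
 * handler has to re-enable interrupts itself before returning. */
```

With IA32_SYSENTER_CS set to 0x08, `sysenter_layout(0x08)` gives kernel CS 0x08, kernel SS 0x10, user CS 0x1B and user SS 0x23, which is why the GDT order cannot be chosen freely when you use these instructions.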
BTW, when AMD was designing AMD64, they got some feedback from Linux kernel devs on the amd64 mailing list that improved the design of 64-bit syscall (to configurably mask RFLAGS) because their initial design would have been problematic for Linux. Links to those archived mailing list posts in this answer.
Recommendation: Use sysenter for your 32-bit kernel. It should be usable everywhere, including on AMD CPUs for years now. Ancient CPUs that don't support it can use the int 0x80 ABI (or whatever number you picked for your OS), if you want to add a 2nd compatibility ABI.
The Linux kernel entry points are well commented, and written fairly readably. While writing "What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?", I had an easy time figuring out what was going on in the entry points into a 64-bit kernel using
syscall (native 64-bit system calls), or int 0x80 or sysenter (32-bit system calls, normally from compat mode, but int 0x80 is also supported for 64-bit processes, where it still invokes the 32-bit ABI!). There's a bunch of complicated stuff going on in case various kinds of tracing / debugging are enabled, but the other parts are fairly easy to follow. See that answer for a walk-through of some of Linux's system-call handling internals.
In
arch/x86/entry, these are the main files of interest:
entry_32.S: 32-bit kernel code for entry from user-space (legacy mode).
entry_64_compat.S: 64-bit kernel code for entry from 32-bit user-space (compat mode -> long mode).
entry_64.S: 64-bit kernel code for entry from 64-bit user-space (long mode -> long mode).
You should be able to find Linux's VDSO code for the user-space side of the sysenter dance that passes the kernel the values it needs to return to user-space. Related: What is better "int 0x80" or "syscall"?, and The Definitive Guide to Linux System Calls will give some useful info on the design choices Linux made.
Intel and AMD both have separate bugs with non-canonical RIP when returning to 64-bit user-space. e.g. on Intel, Linux's
entry_64.S describes it this way:
That can happen if a ptrace system call (e.g. made by a debugger) changed the saved value of the process's RIP to a non-canonical address. Linux checks whether it can use sysret, and if not uses its iret return path. (The sysret path is fast enough that it's worth doing extra work to check that it's safe.)
Note that if a system call blocks / sleeps, the "master copy" of user-space's integer register state is on its kernel stack, where the system call entry point pushed it. (In Linux. Other designs are possible!) But anyway, this is why it's possible to end up with weird saved state that user-space couldn't have run
syscall with (because it would have faulted on jmp to a non-canonical address), or with saved_rcx != saved_RIP. (64-bit syscall sets RCX=RIP, and R11=RFLAGS (before masking), so it clobbers RCX and R11 but allows the kernel to restore RIP and RFLAGS.)
I don't know how 32-bit syscall works, sorry I got off topic here. But I suspect that what you may have read about sysret being unsafe was talking about 64-bit kernels. IDK if there are any similar bugs in 32-bit-kernel sysret, or 64-bit-kernel sysret-to-compat-mode.
At least Wikipedia says this.
And more importantly: syscall doesn't even seem to be supported by any 32-bit CPU (not even AMD's), only in the 32-bit mode of 64-bit AMD CPUs.
So why do you want to use syscall or sysenter?
Nearly all 32-bit x86 OSs use either interrupts (e.g. Linux) or call gates (e.g. Solaris) to enter the kernel...
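Coming back to the question of whether sysret is safe: the check described earlier in the thread (use sysret only when the saved state could really have come from a syscall, otherwise return with iret) can be sketched in C. This is a simplified illustration, not Linux's actual code. It assumes 48-bit virtual addresses (4-level paging), and the function names and the idea of passing the saved registers as plain arguments are made up for the example; a real kernel checks more conditions than these.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* With 48-bit virtual addresses, an address is canonical when
 * bits 63..47 are all zeros or all ones. */
static bool is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;          /* the upper 17 bits */
    return top == 0 || top == 0x1FFFFu;
}

/* 64-bit syscall sets RCX = RIP and R11 = RFLAGS, and sysret restores
 * RIP from RCX and RFLAGS from R11. If the saved frame no longer matches
 * that pattern, or the saved RIP is non-canonical, fall back to iret. */
static bool can_return_with_sysret(uint64_t saved_rip, uint64_t saved_rcx,
                                   uint64_t saved_rflags, uint64_t saved_r11)
{
    if (!is_canonical_48(saved_rip))
        return false;
    if (saved_rcx != saved_rip)
        return false;
    if (saved_r11 != saved_rflags)
        return false;
    return true;
}
```

The fast path stays cheap because the checks are a handful of compares, and the slow iret path is only taken for frames that something such as ptrace has modified.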