I'm studying how Linux handles system calls.
I found that the entry_SYSCALL_64 function is called when a user process executes the syscall instruction to make a system call.
This function saves the interrupt frame.
However, when it pushes the ip onto the interrupt frame, it reads rcx, not rip.
The code is below.
ENTRY(entry_SYSCALL_64)
    UNWIND_HINT_EMPTY
    /*
     * Interrupts are off on entry.
     * We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON,
     * it is too small to ever cause noticeable irq latency.
     */
    swapgs
    /*
     * This path is only taken when PAGE_TABLE_ISOLATION is disabled so it
     * is not required to switch CR3.
     */
    movq %rsp, PER_CPU_VAR(rsp_scratch)
    movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
    /* Construct struct pt_regs on stack */
    pushq $__USER_DS /* pt_regs->ss */
    pushq PER_CPU_VAR(rsp_scratch) /* pt_regs->sp */
    pushq %r11 /* pt_regs->flags */
    pushq $__USER_CS /* pt_regs->cs */
    pushq %rcx /* pt_regs->ip ********************************** */
GLOBAL(entry_SYSCALL_64_after_hwframe)
    pushq %rax /* pt_regs->orig_ax */
    PUSH_AND_CLEAR_REGS rax=$-ENOSYS
    TRACE_IRQS_OFF
    /* IRQs are off. */
    movq %rsp, %rdi
    call do_syscall_64 /* returns with IRQs disabled */
    ....
I marked the important line with a row of '*' characters.
The comment says this push saves the ip register, and the slot being filled is indeed at the offset of ip in pt_regs.
However, the value it pushes comes from rcx.
Does anyone know why?
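Because the syscall instruction stores the address of the instruction following syscall into RCX before entering the kernel. By the time entry_SYSCALL_64 runs, RIP already points at kernel code, so RCX holds the only copy of the user-space return address. The same instruction also saves RFLAGS into R11, which is why pt_regs->flags is pushed from r11.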
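To make the register hand-off concrete, here is a minimal Python sketch that models what the syscall instruction does to the registers and then mirrors the five pushes at the top of entry_SYSCALL_64. It is only a toy model, not kernel code: the Cpu class, execute_syscall, build_hw_frame, and every address and selector value used below are hypothetical names and illustrative numbers chosen for this example.

```python
from dataclasses import dataclass

SYSCALL_INSN_LEN = 2  # the syscall opcode is the two bytes 0F 05


@dataclass
class Cpu:
    """Hypothetical, heavily simplified register file (illustration only)."""
    rip: int      # address of the instruction being executed
    rflags: int
    rcx: int = 0
    r11: int = 0


def execute_syscall(cpu: Cpu, lstar: int, fmask: int) -> None:
    """Model the register effects of syscall in 64-bit mode."""
    cpu.rcx = cpu.rip + SYSCALL_INSN_LEN  # address of the next user instruction
    cpu.r11 = cpu.rflags                  # user RFLAGS is saved in R11
    cpu.rflags &= ~fmask                  # flags selected by the mask are cleared
    cpu.rip = lstar                       # RIP now points at the kernel entry


def build_hw_frame(cpu: Cpu, user_rsp: int, user_ss: int, user_cs: int) -> dict:
    """Mirror the five pushq instructions: ss, sp, flags, cs, ip."""
    return {
        "ss": user_ss,
        "sp": user_rsp,
        "flags": cpu.r11,  # pushq %r11
        "cs": user_cs,
        "ip": cpu.rcx,     # pushq %rcx
    }


# Illustrative values only: a user program executing syscall at 0x401000.
cpu = Cpu(rip=0x401000, rflags=0x202)
execute_syscall(cpu, lstar=0xFFFFFFFF81000000, fmask=0x200)
frame = build_hw_frame(cpu, user_rsp=0x7FFD0000, user_ss=0x2B, user_cs=0x33)

print(hex(frame["ip"]))  # 0x401002, the instruction after syscall
print(hex(cpu.rip))      # 0xffffffff81000000, already inside the kernel
```

In this model, once the entry code starts running, rip no longer says anything about where user space was executing, so the return address has to be taken from rcx. Unlike an interrupt, syscall does not push anything onto a stack, which is why the entry code builds the frame by hand.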