Constraining r10 register in gcc inline x86_64 assembly

Posted 2020-03-13 06:17

Question:

I'm having a go at writing a very lightweight libc replacement library so that I can better understand the kernel - application interface. The first task is clearly getting some system call wrappers in place. I've successfully got 1 to 3 argument wrappers working but I'm struggling with a 4 argument variant. Here's my starting point:

long _syscall4(long type, long a1, long a2, long a3, long a4)
{
    long ret;
    asm
    (
        "syscall"
        : "=a"(ret)
        : "a"(type), "D"(a1), "S"(a2), "d"(a3), "r10"(a4)
        : "c", "r11"
    );
    return ret;
}

The compiler gives me the following error:

error: matching constraint references invalid operand number

My _syscall3 function works fine but doesn't use r10 or have a clobber list.

Any thoughts?

Answer 1:

There are no constraint letters for the registers %r8 .. %r15. However, more recent versions of GCC (gcc-4.x and later) should accept a local register variable:

register long r10 asm("r10") = a4;

then use the input constraint: "r" (r10) for your asm statement.
https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html


Note that forcing the choice of an "r" constraint for Extended asm is the only behaviour that GCC guarantees for register-asm locals. Things like register void *rsp asm("rsp"); and void *stack_pointer = rsp; do sometimes work, but are not guaranteed and not recommended anymore.


You're going to want your syscall wrapper asm statement to be volatile and to have a "memory" clobber, unless you write specific wrappers for specific system calls that know which args are pointers and use a dummy memory input or output instead (as per How can I indicate that the memory *pointed* to by an inline ASM argument may be used?)

It needs to be volatile because doing write(1, buf, 16) twice should print the buffer twice, not just CSE the return value! System calls are in general not pure functions of their inputs, so you need volatile.

(Some specific system call wrappers like getpid could be non-volatile, because they do return the same thing every time, unless you also use fork. But getpid is more efficient if done through the VDSO so it doesn't have to enter the kernel in the first place if you're on Linux, so if you're making a custom wrapper for getpid and clock_gettime you probably don't want syscall in the first place. See The Definitive Guide to Linux System Calls)

The "memory" clobber is needed because a pointer in a register does not imply that the pointed-to memory is also an input or output. Stores to a buffer that are only read by a write system call need to not be optimized away as dead stores. Or for munmap, the compiler had better have finished any loads/stores before the memory is unmapped. Some system calls don't take any pointer inputs, and don't need "memory", but a generic wrapper has to make worst-case assumptions.

register ... asm("r10") does not in general require asm volatile or "memory" clobbers, but a syscall wrapper does.



Answer 2:

Presumably because no instructions have a specific requirement for the r10 register, the gcc folks didn't create a constraint for it (given that the constraints are primarily for the machine descriptions). If you insist on inline asm I don't think you can do better than using a generic "r" (or "m") constraint and moving into r10 yourself (and adding it to the clobber list).