Unlimited stack cannot grow beyond the initial 132KiB

Posted 2019-08-31 05:24

Question:

I'm running some experiments with the stack and the following got me stuck.

It can be seen that Linux gives a process an initial [stack] mapping 132KiB in size. With ulimit -s unlimited, the stack should be able to grow further as long as we adjust rsp accordingly. So I set ulimit -s unlimited and ran the following program:

PAGE_SIZE     equ 0x1000

;mmap stuff
PROT_READ     equ 0x01
PROT_WRITE    equ 0x02
MAP_ANONYMOUS equ 0x20
MAP_PRIVATE   equ 0x02
MAP_FIXED     equ 0x10

;syscall numbers
SYS_mmap      equ 0x09
SYS_exit      equ 0x3c

section .text

global _start

_start:
    ; page alignment
    and rsp, -0x1000

    ; call mmap 0x101 pages below the rsp with fixed mapping
    mov rax, SYS_mmap
    lea rdi, [rsp - 0x101 * PAGE_SIZE]
    mov rsi, PAGE_SIZE
    mov rdx, PROT_READ | PROT_WRITE
    mov r10, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED
    mov r8, -1
    mov r9, 0
    syscall

    sub rsp, 0x80 * PAGE_SIZE
    mov qword [rsp], -1 ; SEGV

    mov rax, SYS_exit
    mov rdi, 0
    syscall

Even after adjusting rsp, it still segfaults, and I don't understand why. I explicitly created a fixed mapping at rsp - 0x101 * PAGE_SIZE, that is, 0x101 pages below rsp.

My expectation was that it would not interfere with expanding the stack (down to rsp - 0x80 * PAGE_SIZE in my case) until we hit the fixed mapping at rsp - 0x101 * PAGE_SIZE.

Btw, if I remove MAP_FIXED from the mapping, the requested address is not honored and no segfault occurs (as expected). Here is the strace output:

mmap(0x7ffe4e0fe000, 4096, PROT_READ|PROT_WRITE, 
     MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x1526e3f3a000

But MAP_FIXED does the job:

mmap(0x7ffd8979c000, 4096, PROT_READ|PROT_WRITE, 
     MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7ffd8979c000

UPD: The segfault is not triggered if lea rdi, [rsp - 0x101 * PAGE_SIZE] is replaced with lea rdi, [rsp - 0x200 * PAGE_SIZE].

Answer 1:

The Linux kernel enforces a gap between the stack and other mappings. If that gap cannot be maintained, the stack will not grow.

The relevant source code is in mm/mmap.c, from line 2498:

/* enforced gap between the expanding stack and other mappings. */
unsigned long stack_guard_gap = 256UL<<PAGE_SHIFT;

static int __init cmdline_parse_stack_guard_gap(char *p)
{
    unsigned long val;
    char *endptr;

    val = simple_strtoul(p, &endptr, 10);
    if (!*endptr)
        stack_guard_gap = val << PAGE_SHIFT;

    return 0;
}
__setup("stack_guard_gap=", cmdline_parse_stack_guard_gap);

and line 2424:

int expand_downwards(struct vm_area_struct *vma,
                   unsigned long address)
{
    struct mm_struct *mm = vma->vm_mm;
    struct vm_area_struct *prev;
    int error = 0;

    address &= PAGE_MASK;
    if (address < mmap_min_addr)
        return -EPERM;

    /* Enforce stack_guard_gap */
    prev = vma->vm_prev;
    /* Check that both stack segments have the same anon_vma? */
    if (prev && !(prev->vm_flags & VM_GROWSDOWN) &&
            (prev->vm_flags & (VM_WRITE|VM_READ|VM_EXEC))) {
        if (address - prev->vm_end < stack_guard_gap)
            return -ENOMEM;
    }

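You can see it's adjustable via the stack_guard_gap= kernel parameter, but the default is 256 pages (1 MiB with 4 KiB pages). The one-page mapping at rsp - 0x101 * PAGE_SIZE ends at rsp - 0x100 * PAGE_SIZE, which is only 0x80 (128) pages below the faulting address rsp - 0x80 * PAGE_SIZE. The gap cannot be maintained, so the stack is not expanded and the write segfaults. With 0x200 the mapping ends 0x17f (383) pages below the faulting address, so the gap fits.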