How is the address of the text section of a PIE ex

2020-02-24 05:14发布

问题:

First I tried to reverse engineer it a bit:

printf '
#include <stdio.h>
int main() {
    puts("hello world");
}
' > main.c
gcc -std=c99 -pie -fpie -ggdb3 -o pie main.c
echo 2 | sudo tee /proc/sys/kernel/randomize_va_space
readelf -s ./pie | grep -E 'main$'
gdb -batch -nh \
  -ex 'set disable-randomization off' \
  -ex 'start' -ex 'info line' \
  -ex 'start' -ex 'info line' \
  -ex 'set disable-randomization on' \
  -ex 'start' -ex 'info line' \
  -ex 'start' -ex 'info line' \
  ./pie \
;

Output:

64: 000000000000063a    23 FUNC    GLOBAL DEFAULT   14 main
Temporary breakpoint 1, main () at main.c:4
4           puts("hello world");
Line 4 of "main.c" starts at address 0x5575f5fd263e <main+4> and ends at 0x5575f5fd264f <main+21>.
Temporary breakpoint 2 at 0x5575f5fd263e: file main.c, line 4.

Temporary breakpoint 2, main () at main.c:4
4           puts("hello world");
Line 4 of "main.c" starts at address 0x55e3fbc9363e <main+4> and ends at 0x55e3fbc9364f <main+21>.
Temporary breakpoint 3 at 0x55e3fbc9363e: file main.c, line 4.

Temporary breakpoint 3, main () at main.c:4
4           puts("hello world");
Line 4 of "main.c" starts at address 0x55555555463e <main+4> and ends at 0x55555555464f <main+21>.
Temporary breakpoint 4 at 0x55555555463e: file main.c, line 4.

Temporary breakpoint 4, main () at main.c:4
4           puts("hello world");
Line 4 of "main.c" starts at address 0x55555555463e <main+4> and ends at 0x55555555464f <main+21>.

which indicates that it is 0x555555554000 + random offset + 63e.

But then I tried to grep the Linux kernel and glibc source code for 555555554 and there were no hits.

Which part of which code calculates that address?

I came across this while answering: What is the -fPIE option for position-independent executables in gcc and ld?

回答1:

Some Internet search of 0x555555554000 gives hints: there were problems with ThreadSanitizer https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual

Q: When I run the program, it says: FATAL: ThreadSanitizer can not mmap the shadow memory (something is mapped at 0x555555554000 < 0x7cf000000000). What to do? You need to enable ASLR:

 $ echo 2 >/proc/sys/kernel/randomize_va_space

This may be fixed in future kernels, see https://bugzilla.kernel.org/show_bug.cgi?id=66721 ...

 $ gdb -ex 'set disable-randomization off' --args ./a.out

and https://lwn.net/Articles/730120/ "Stable kernel updates." Posted Aug 7, 2017 20:40 UTC (Mon) by hmh (subscriber) https://marc.info/?t=150213704600001&r=1&w=2 (https://patchwork.kernel.org/patch/9886105/, commit c715b72c1ba4)

Moving the x86_64 and arm64 PIE base from 0x555555554000 to 0x000100000000 broke AddressSanitizer. This is a partial revert of:

  • commit eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE") (https://patchwork.kernel.org/patch/9807325/ https://lkml.org/lkml/2017/6/21/560)
  • commit 02445990a96e ("arm64: move ELF_ET_DYN_BASE to 4GB / 4MB") (https://patchwork.kernel.org/patch/9807319/)

Reverted code was:

b/arch/arm64/include/asm/elf.h
 /*
  * This is the base location for PIE (ET_DYN with INTERP) loads. On
- * 64-bit, this is raised to 4GB to leave the entire 32-bit address
+ * 64-bit, this is above 4GB to leave the entire 32-bit address   * space open for things that want to use the area for 32-bit pointers.   */
-#define ELF_ET_DYN_BASE      0x100000000UL
+#define ELF_ET_DYN_BASE      (2 * TASK_SIZE_64 / 3)


+++ b/arch/x86/include/asm/elf.h
 /*
  * This is the base location for PIE (ET_DYN with INTERP) loads. On
- * 64-bit, this is raised to 4GB to leave the entire 32-bit address
+ * 64-bit, this is above 4GB to leave the entire 32-bit address
  * space open for things that want to use the area for 32-bit pointers.
  */
 #define ELF_ET_DYN_BASE      (mmap_is_ia32() ? 0x000400000UL : \
-                       0x100000000UL)
+                       (TASK_SIZE / 3 * 2))

So, 0x555555554000 is related with ELF_ET_DYN_BASE macro (referenced in fs/binfmt_elf.c for ET_DYN as not randomized load_bias) and for x86_64 and arm64 it is like 2/3 of TASK_SIZE. When there is no CONFIG_X86_32, x86_64 has TASK_SIZE of 2^47 - one page in arch/x86/include/asm/processor.h

/*
 * User space process size. 47bits minus one guard page.  The guard
 * page is necessary on Intel CPUs: if a SYSCALL instruction is at
 * the highest possible canonical userspace address, then that
 * syscall will enter the kernel with a non-canonical return
 * address, and SYSRET will explode dangerously.  We avoid this
 * particular problem by preventing anything from being mapped
 * at the maximum canonical address.
 */
#define TASK_SIZE_MAX   ((1UL << 47) - PAGE_SIZE)

Older versions:

/*
 * User space process size. 47bits minus one guard page.
 */
#define TASK_SIZE_MAX   ((1UL << 47) - PAGE_SIZE)

Newer versions also have support of 5level with __VIRTUAL_MASK_SHIFT of 56 bit - v4.17/source/arch/x86/include/asm/processor.h (but don't want to use it before enabled by user + commit b569bab78d8d ".. Not all user space is ready to handle wide addresses")).

So, 0x555555554000 is rounded down (by load_bias = ELF_PAGESTART(load_bias - vaddr);, vaddr is zero) from the formula (2^47-1page)*(2/3) (or 2^56 for larger systems):

$ echo 'obase=16; (2^47-4096)/3*2'| bc -q
555555554AAA
$ echo 'obase=16; (2^56-4096)/3*2'| bc -q
AAAAAAAAAAA000

Some history of 2/3 * TASK_SIZE:

  • commit 9b1bbf6ea9b2 "use ELF_ET_DYN_BASE only for PIE" has usefull comments: "The ELF_ET_DYN_BASE position was originally intended to keep loaders away from ET_EXEC binaries ..."

  • Don't overflow 32bits with 2*TASK_SIZE "[uml-user] [PATCH] x86, UML: fix integer overflow in ELF_ET_DYN_BASE", 2015 and "ARM: 8320/1: fix integer overflow in ELF_ET_DYN_BASE", 2015:

Almost all arches define ELF_ET_DYN_BASE as 2/3 of TASK_SIZE. Though it seems that some architectures do this in a wrong way. The problem is that 2*TASK_SIZE may overflow 32-bits so the real ELF_ET_DYN_BASE becomes wrong. Fix this overflow by dividing TASK_SIZE prior to multiplying: (TASK_SIZE / 3 * 2)

  • Same in 4.x, 3.y, 2.6.z, (where is davej-history git repo? archive and at or.cz) 2.4.z, ... added in 2.1.54 of 06-Sep-1997
diff --git a/include/asm-i386/elf.h b/include/asm-i386/elf.h
+/* This is the location that an ET_DYN program is loaded if exec'ed.  Typical
+   use of this is to invoke "./ld.so someprog" to test out a new version of
+   the loader.  We need to make sure that it is out of the way of the program
+   that it will "exec", and that there is sufficient room for the brk.  */
+
+#define ELF_ET_DYN_BASE         (2 * TASK_SIZE / 3)