Setup:
- Ubuntu 18 x64
- x86_64 application
- Arbitrary code execution from inside the application
I'm trying to write code which should be able to find structures in memory even with ASLR enabled. Sadly, I couldn't find any static references to those regions, so I'm guessing I have to use the brute-force way and scan the process memory. What I tried to do was to scan the whole address space of the application, but that doesn't work, as some memory areas are not allocated and therefore yield SIGSEGV when accessed. Now I'm thinking it would be a good idea to call getpid(), then use the pid to access /proc/$PID/maps and try to parse the data from there.
But I wonder, is there a better way to identify allocated regions? Maybe even a way that doesn't require me to access libc (getpid, open, close) or fiddle around with strings?
I don't think there's any standard POSIX API for this.
Parsing /proc/self/maps is your best bet. (There may be a library to help with this, but I don't know of one.)

You tagged this ASLR, though. If you just want to know where the text / data / bss segments are, you can put labels at the start/end of them so those addresses are available in C. For example,

extern const char bss_end[];

would be a good way to reference a label you put at the end of the BSS using a linker script and maybe some hand-written asm. The compiler-generated asm will use a RIP-relative LEA instruction to get the address in a register relative to the current instruction address (which the CPU knows because it's executing the code mapped there).

Or maybe just a linker script and declaring dummy C variables in custom sections.
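As a rough illustration of the label idea, here is a minimal sketch that skips the custom linker script and relies on the symbols etext, edata and end, which the default GNU ld script provides (see the end(3) man page). The helper names in_bss and print_segment_ends and the zeroed_buffer array are made up for this example, and the sketch assumes a typical Linux toolchain with the default section layout.

```c
#include <stdint.h>
#include <stdio.h>

/* Provided by the default GNU linker script (see end(3)).
 * Only their addresses are meaningful, never their contents. */
extern char etext, edata, end;

/* Hypothetical test object: zero-initialized statics normally land in .bss. */
static char zeroed_buffer[4096];

/* Hypothetical helper: returns 1 if p lies between the end of .data
 * (edata) and the end of .bss (end), 0 otherwise. */
static int in_bss(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    return a >= (uintptr_t)&edata && a < (uintptr_t)&end;
}

/* Hypothetical helper: prints the three segment-end addresses.
 * The values change from run to run when ASLR and PIE are enabled. */
static void print_segment_ends(void)
{
    printf("etext=%p edata=%p end=%p buffer_in_bss=%d\n",
           (void *)&etext, (void *)&edata, (void *)&end,
           in_bss(zeroed_buffer));
}
```

Calling print_segment_ends() from main() shows the addresses the loader picked for this run. A custom label such as bss_end would be used the same way once the linker script defines it.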
I'm not sure if you can do that for the stack mapping. With a large environment and/or argv, the initial stack on entry to main() or even _start might not be in the same page as the highest address in the stack mapping.

To scan, you either need to catch SIGSEGV or scan with system calls instead of user-space loads or stores.

mmap and mprotect can't query the old setting, so they're not very useful for non-destructive stuff. mmap with a hint but without MAP_FIXED could map a page, and then you could munmap it. If the actual chosen address != hint, then you could assume the address was in use.

Maybe a better option would be to scan with
madvise(MADV_NORMAL) and check for ENOMEM, but only one page at a time.

You could even do this portably with

err = posix_madvise(page, 4096, POSIX_MADV_NORMAL);

Note that posix_madvise returns the error number directly instead of setting errno, so check the return value for ENOMEM: "Addresses in the specified range are partially or completely outside the caller's address space."

On Linux with madvise(2) you could use MADV_DOFORK or something that's even less likely to be at a non-default setting for each page.

But on Linux, an even better choice for read-only querying of the process memory mapping is
mincore(2): it also uses the error code ENOMEM for invalid addresses in the queried range: "addr to addr + length contained unmapped memory". (EFAULT is for the result vector pointing to unmapped memory, not addr.)

Only the errno result is useful; the vec result shows you whether pages are hot in RAM or not. (I'm not sure if it shows you which pages are wired into the HW page tables, or if it would count a page that's resident in memory in the pagecache for a memory-mapped file but not wired, so an access would trigger a soft page fault.)

You can binary-search for the end of a large mapping by calling mincore with larger lengths.

But unfortunately I don't see any equivalent for finding the next mapping after an unmapped page, which would be much more useful because most of the address space will be unmapped. Especially in x86-64 with 64-bit addresses!
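The mincore probe described above could look roughly like the sketch below. It is Linux-specific, and the helper names page_is_mapped and mapped_run_end are hypothetical. For simplicity it walks forward one page at a time instead of binary-searching with larger lengths, and it treats adjacent mappings as one run, since mincore only reports whether a range is mapped.

```c
#define _DEFAULT_SOURCE   /* exposes mincore() and MAP_ANONYMOUS in glibc */
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the page containing addr is mapped,
 * 0 if the kernel reports ENOMEM (unmapped), -1 on any other error. */
static int page_is_mapped(const void *addr)
{
    uintptr_t page_size = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t page = (uintptr_t)addr & ~(page_size - 1); /* must be aligned */
    unsigned char vec;  /* one residency byte per page; its value is ignored */

    if (mincore((void *)page, page_size, &vec) == 0)
        return 1;
    return errno == ENOMEM ? 0 : -1;
}

/* Hypothetical helper: starting at the page containing start, returns the
 * address of the first page that is not mapped, checking at most
 * max_pages pages. */
static uintptr_t mapped_run_end(const void *start, size_t max_pages)
{
    uintptr_t page_size = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t page = (uintptr_t)start & ~(page_size - 1);

    for (size_t i = 0; i < max_pages; i++, page += page_size) {
        if (page_is_mapped((const void *)page) != 1)
            break;
    }
    return page;
}
```

Neither helper reads or writes the probed memory, so nothing is faulted in. The cost is one system call per page, which is why this only makes sense for small ranges or as a check on candidate addresses.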
For sparse files there's lseek(SEEK_DATA). I wonder if that works on Linux's /proc/self/mem? Probably not.

So maybe large (like 256MB) (tmp=mmap(page, blah blah)) == page calls would be a good way to scan through unmapped regions looking for mapped pages. Either way you simply munmap(tmp), whether mmap used your hint address or not.

Parsing /proc/self/maps is almost certainly more efficient.

But the most efficient thing would be putting labels where you want them for static addresses, and tracking dynamic allocations so you already know where your memory is. This works if you have no memory leaks. (glibc malloc might have an API to walk the mappings, but I'm not sure.)

Note that any system call will produce an
errno=EFAULT if you pass it an unmapped address for a parameter that's supposed to point to something.

One possible candidate is access(2), which takes a filename and returns an integer. It has zero effect on the state of anything else, success or fail, but the downside is filesystem access if the pointed-to memory is a valid path string. And it's looking for an implicit-length C string, so it could also be slow if passed a pointer to memory with no 0 byte anywhere soon. I guess ENAMETOOLONG would kick in, but it still definitely reads every accessible page you use it on, faulting it in even if it was paged out.

If you open a file descriptor on /dev/null, you could make write() system calls with that. Or even with writev(2): writev(devnull_fd, io_vec, count) to pass the kernel a vector of pointers in one system call, and get an EFAULT if any of them are bad (with lengths of 1 byte each). But (unless the /dev/null driver skips reads early enough) this does actually read from pages that are valid, faulting them in, unlike mincore(). Depending on how it's implemented internally, the /dev/null driver might see the request early enough for its "return true"-without-doing-anything implementation to avoid actually touching pages after checking for EFAULT. Would be interesting to check.
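Here is a minimal sketch of the EFAULT idea with one substitution: it writes to a pipe instead of /dev/null. As far as I know, the Linux /dev/null write handler returns the byte count without ever reading the user buffer, so a bad pointer would likely go unnoticed there, while a pipe write has to copy the byte into the kernel. The helper name byte_is_readable is hypothetical, and creating a fresh pipe per probe is only for simplicity; a real scanner would reuse one pipe and drain it.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if one byte at addr can be read,
 * 0 if the kernel reports EFAULT, -1 on any other error.
 * Unlike mincore(), this faults the page in if it is valid. */
static int byte_is_readable(const void *addr)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    /* The kernel copies one byte from addr into the pipe buffer;
     * an inaccessible address makes the write fail with EFAULT. */
    ssize_t n = write(fds[1], addr, 1);
    int saved_errno = errno;   /* close() below may overwrite errno */

    close(fds[0]);
    close(fds[1]);

    if (n == 1)
        return 1;
    return saved_errno == EFAULT ? 0 : -1;
}
```

For example, byte_is_readable(&some_local) should report 1, while a pointer into the unmapped first page of the address space should report 0 rather than raising SIGSEGV.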