How to determine if returned pointer is on the sta

Posted 2019-01-11 14:37


聊天终结者


Question:

I have a plugin architecture in which I call functions in a dynamic library. Each one returns a char* holding the answer, which is used at some later stage.

This is the signature of a plugin function:

char* execute(ALLOCATION_BEHAVIOR* free_returned_value, unsigned int* length);

where ALLOCATION_BEHAVIOR must be one of DO_NOT_FREE_ME, FREE_ME, or DELETE_ME. Through it, the plugin (in the library) tells me how it allocated the string it has just returned. DO_NOT_FREE_ME means this is a variable I'm not supposed to touch (such as a const static char* which never changes), FREE_ME means I should use free() to release the returned value, and DELETE_ME means I should use delete[] to avoid a memory leak.

Obviously, I don't trust the plugins, so if a plugin tells me to free() the variable, I would like to be able to check that it really is something that can be freed. Is this possible using today's C/C++ technology on Linux/Windows?
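For illustration, the host-side handling that this protocol implies might look like the sketch below. The enum definition and the helper name dispose_result are assumptions, since the question only names the three values. Note that the helper simply believes the plugin, which is exactly the weakness being asked about.

```cpp
#include <cstdlib>

// Assumed definition: the question names the values but does not show the enum.
enum ALLOCATION_BEHAVIOR { DO_NOT_FREE_ME, FREE_ME, DELETE_ME };

// Hypothetical host-side helper: releases `value` the way the plugin asked.
// Returns true if a deallocation function was actually called.
bool dispose_result(char* value, ALLOCATION_BEHAVIOR behavior) {
    switch (behavior) {
    case DO_NOT_FREE_ME:
        return false;         // static or plugin-owned data: leave it alone
    case FREE_ME:
        std::free(value);     // plugin claims malloc/calloc/realloc
        return true;
    case DELETE_ME:
        delete[] value;       // plugin claims new char[n]
        return true;
    }
    return false;             // unknown value: leaking is safer than guessing
}
```

A call such as dispose_result(result, behavior) would follow the last use of the returned string.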

Answer 1:

Distinguishing between malloc/free and new/delete is generally not possible, at least not in a reliable and/or portable way. Even more so as new simply wraps malloc anyway in many implementations.

None of the following alternatives to distinguish heap/stack have been tested, but they should all work.

Linux:

Solution proposed by Luca Tettananti, parse /proc/self/maps to get the address range of the stack.
As the first thing at startup, clone your process, this implies supplying a stack. Since you supply it, you automatically know where it is.
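Call GCC's __builtin_frame_address function with increasing level parameter until it returns 0. You then know the depth. Now call __builtin_frame_address again with the maximum level, and once with a level of 0. Anything that lives on the stack must necessarily be between these two addresses. Be aware that the level argument must be a compile-time constant, and that GCC documents calls with a nonzero level as having unpredictable effects, up to crashing the program, so only level 0 is safe to rely on.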
sbrk(0) as the first thing at startup, and remember the value. Whenever you want to know if something is on the heap, sbrk(0) again -- something that's on the heap must be between the two values. Note that this will not work reliably with allocators that use memory mapping for large allocations.

Knowing the location and size of the stack (alternatives 1 and 2), it's trivial to find out if an address is within that range. If it's not, it is necessarily "heap" (unless someone tries to be super smart-ass and gives you a pointer to a static global, or a function pointer, or such...).
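As a rough sketch of alternative 1, the "[stack]" entry of /proc/self/maps can be read and used for a range check. The helper names main_stack_bounds and on_main_stack are made up for this example. The entry describes only the initial thread's stack, so pointers into other threads' stacks are not recognized.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Hypothetical helper: finds the "[stack]" line in /proc/self/maps (Linux only).
// Returns false if the file cannot be opened or the line cannot be parsed.
bool main_stack_bounds(std::uintptr_t* lo, std::uintptr_t* hi) {
    std::FILE* maps = std::fopen("/proc/self/maps", "r");
    if (!maps) return false;
    char line[512];
    bool found = false;
    while (std::fgets(line, sizeof line, maps)) {
        if (std::strstr(line, "[stack]")) {
            std::uint64_t start = 0, end = 0;
            if (std::sscanf(line, "%" SCNx64 "-%" SCNx64, &start, &end) == 2) {
                *lo = static_cast<std::uintptr_t>(start);
                *hi = static_cast<std::uintptr_t>(end);
                found = true;
            }
            break;
        }
    }
    std::fclose(maps);
    return found;
}

// True if p lies inside the initial thread's stack mapping [lo, hi).
bool on_main_stack(const void* p) {
    std::uintptr_t lo = 0, hi = 0;
    if (!main_stack_bounds(&lo, &hi)) return false;
    std::uintptr_t a = reinterpret_cast<std::uintptr_t>(p);
    return lo <= a && a < hi;
}
```

Called from the main thread, on_main_stack(&some_local) should report true, while a pointer to a global or to a malloc'd block should report false.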

Windows:

Use CaptureStackBackTrace, anything living on the stack must be between the returned pointer array's first and last element.
Use GCC-MinGW (and __builtin_frame_address, which should just work) as above.
Use GetProcessHeaps and HeapWalk to check every allocated block for a match. If no block in any of the heaps matches, it's consequently allocated on the stack (... or a memory mapping, if someone tries to be super-smart with you).
Use HeapReAlloc with HEAP_REALLOC_IN_PLACE_ONLY and with exactly the same size. If this fails, the memory block starting at the given address is not allocated on the heap. If it "succeeds", it is a no-op.
Use GetCurrentThreadStackLimits (available on Windows 8 / Server 2012 and later).
Call NtCurrentTeb() (or read fs:[18h] on 32-bit x86) and use the fields StackBase and StackLimit of the returned TEB.

Answer 2:

I asked the same question a couple of years ago on comp.lang.c, and I liked the response of James Kuyper:

Yes. Keep track of it when you allocate it.

The way to do this is to use the concept of ownership of memory. At all times during the lifetime of a block of allocated memory, you should always have one and only one pointer that "owns" that block. Other pointers may point into that block, but only the owning pointer should ever be passed to free().

If at all possible, an owning pointer should be reserved for the purpose of owning pointers; it should not be used to store pointers to memory it does not own. I generally try to arrange that an owning pointer is initialized with a call to malloc(); if that's not feasible, it should be set to NULL sometime before first use. I also try to make sure that the lifetime of an owning pointer ends immediately after I free() the memory it owns. However, when that's not possible, set it to NULL immediately after free()ing that memory. With those precautions in place, you should not let the lifetime of a non-null owning pointer end without first passing it to free().

If you have trouble keeping track of which pointers are 'owning' pointers, put a comment about that fact next to their declaration. If you have lots of trouble, use a naming convention to keep track of this feature.

If, for any reason, it is not possible to reserve an owning pointer variable exclusively for ownership of the memory it points at, you should set aside a separate flag variable to keep track of whether or not that pointer currently owns the memory it points at. Creating a struct that contains both the pointer and the ownership flag is a very natural way to handle this - it ensures that they don't get separated.

If you have a rather complicated program, it may be necessary to transfer ownership of memory from one owning pointer variable to another. If so, make sure that any memory owned by the target pointer is free()d before the transfer, and unless the lifetime of the source pointer ends immediately after the transfer, set the source pointer to NULL. If you're using ownership flags, reset them accordingly.
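The pointer-plus-flag struct and the transfer rule described above could be sketched as follows. The names OwnedPtr, release and transfer are invented for this example and do not come from the quoted post.

```cpp
#include <cstdlib>

// Hypothetical pairing of a pointer with its ownership flag.
struct OwnedPtr {
    char* ptr;
    bool  owns;   // true only while this variable is responsible for free()
};

// Frees the block if it is owned, then resets the variable so that the
// same block cannot be freed twice through it.
void release(OwnedPtr* p) {
    if (p->owns) std::free(p->ptr);
    p->ptr = nullptr;
    p->owns = false;
}

// Transfers ownership: the target's old block is freed first and the
// source is set to NULL afterwards, so there is always exactly one owner.
void transfer(OwnedPtr* target, OwnedPtr* source) {
    if (target == source) return;
    release(target);
    *target = *source;
    source->ptr = nullptr;
    source->owns = false;
}
```

Keeping the flag next to the pointer means a non-owning alias can be stored in the same type with owns set to false, and release() stays safe to call on it.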

Answer 3:

The plugin/library/whatever should not be returning an enum through a passed 'ALLOCATION_BEHAVIOR*' pointer. It's messy, at best. The 'deallocation' scheme belongs with the data and should be encapsulated with it.

I would prefer to return an object pointer of some base class that has a virtual 'release()' member function. The main app can call it whenever it wants or needs to, and it handles the deallocation as required for that object. release() could do nothing, return the object to a cache specified in a private data member of the object, or just delete it, depending on the override supplied by the plugin subclasses.
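A minimal sketch of such an interface, assuming host and plugin are built with compatible C++ compilers. The class names PluginResult, StaticResult and HeapResult are hypothetical.

```cpp
#include <cstddef>

// Hypothetical interface shared between the host and its plugins.
class PluginResult {
public:
    virtual const char* data() const = 0;
    virtual std::size_t length() const = 0;
    virtual void release() = 0;   // the host calls this instead of free/delete[]
protected:
    virtual ~PluginResult() = default;   // the host cannot delete via the base
};

// Result backed by static storage: release() has nothing to do.
class StaticResult : public PluginResult {
public:
    const char* data() const override { return "pong"; }
    std::size_t length() const override { return 4; }
    void release() override {}
};

// Result allocated with new: release() destroys the object and its buffer.
class HeapResult : public PluginResult {
public:
    explicit HeapResult(std::size_t n) : buf_(new char[n]()), len_(n) {}
    HeapResult(const HeapResult&) = delete;
    HeapResult& operator=(const HeapResult&) = delete;
    const char* data() const override { return buf_; }
    std::size_t length() const override { return len_; }
    void release() override { delete this; }
private:
    ~HeapResult() override { delete[] buf_; }   // heap-only: use release()
    char* buf_;
    std::size_t len_;
};
```

The host only ever sees a PluginResult* and calls release() when it is done, so it never needs to know which allocation scheme the plugin chose.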

If this is not possible because the plugin is written in a different language, or built with a different compiler, the plugin could return a function as well as the data so that the main app can call it back with the data pointer as a parameter for the purpose of deallocation. This at least allows you to put the char* and function* into the same object/struct on the C++ side, so maintaining at least some semblance of encapsulation and allowing the plugin to choose any deallocation scheme it wants to.

Edit - a scheme like this would also work safely if the plugin used a different heap than the main app - maybe it's in a DLL that has its own sub-allocator.
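The function-plus-data variant could look roughly like this. The struct plugin_result, the release_fn typedef and execute_example are assumptions made for the sketch. Across a real library boundary these declarations would normally be given C linkage.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical result type: the data travels with its own deallocator.
typedef void (*release_fn)(char* data);

struct plugin_result {
    char*        data;
    unsigned int length;
    release_fn   release;   // NULL means "do not free"
};

// Plugin side (sketch): allocates with malloc and says so via the callback.
static void release_with_free(char* data) { std::free(data); }

plugin_result execute_example() {
    const char text[] = "hello";
    plugin_result r;
    r.length = sizeof text - 1;
    r.data = static_cast<char*>(std::malloc(sizeof text));
    if (r.data) std::memcpy(r.data, text, sizeof text);
    else r.length = 0;
    r.release = release_with_free;
    return r;
}

// Host side: never calls free or delete[] itself, only the supplied callback.
void dispose(plugin_result* r) {
    if (r->release && r->data) r->release(r->data);
    r->data = nullptr;
    r->length = 0;
    r->release = nullptr;
}
```

Because the callback lives in the plugin, the memory is released by the same runtime and heap that allocated it.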

Answer 4:

On Linux you can parse /proc/self/maps to extract the location of the stack and of the heap and then check whether the pointer falls into one of the ranges.

This won't tell you whether the memory should be handled by free or delete, though. If you control the architecture, you can let the plugin free the allocated memory by adding the appropriate API (in other words, a plugin_free function that is symmetrical to your execute). Another common pattern is to keep track of the allocations in a context object (created at init time) that is passed to the plugin at each call and is then used by the plugin at shutdown to do the cleanup.
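Both patterns can be combined in a small sketch. Only the name plugin_free comes from the answer; plugin_context, ctx_alloc and plugin_shutdown are hypothetical, and a real plugin API would expose these with C linkage.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Hypothetical context object, created at init time and passed to the
// plugin on every call.
struct plugin_context {
    std::vector<void*> allocations;
};

// Plugin side: every allocation is recorded in the context.
char* ctx_alloc(plugin_context* ctx, std::size_t n) {
    void* p = std::malloc(n);
    if (p) ctx->allocations.push_back(p);
    return static_cast<char*>(p);
}

// Plugin side, symmetrical to execute: frees one tracked result early.
// Pointers the context does not know are ignored rather than freed.
void plugin_free(plugin_context* ctx, char* p) {
    auto it = std::find(ctx->allocations.begin(), ctx->allocations.end(),
                        static_cast<void*>(p));
    if (it == ctx->allocations.end()) return;
    std::free(p);
    ctx->allocations.erase(it);
}

// Plugin side, at shutdown: releases whatever is still tracked.
void plugin_shutdown(plugin_context* ctx) {
    for (void* p : ctx->allocations) std::free(p);
    ctx->allocations.clear();
}
```

With this design the host never has to decide between free and delete[], because it only hands pointers back to the plugin that produced them.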

Answer 5:

How do they allocate something on the stack that you can then free, as they've returned? That's just going to die horribly. Even using it is going to die horribly.

If you want to check whether they've returned you a pointer to static data, then you probably want to get hold of your heap top and bottom (which I'm pretty sure is available on Linux, using sbrk), and see if the returned pointer is in that range or not.

Of course, it's possible that even a valid pointer in that range shouldn't be freed because they've stashed another copy to it which they're going to use later. And if you're not going to trust them, you should not trust them at all.

Answer 6:

I'm using the following code to check student assignments. Returning stack memory is a common pitfall, so I wanted to automatically check for it.

Using `sbrk`

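This method should work on all Unix variants and on all CPU architectures, as long as malloc serves the allocation from the program break (sbrk) rather than from a memory mapping, which many allocators use for large blocks.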

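#include <unistd.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <assert.h>

/* True if pointer lies in the program break area [init_brk, current break).
   Blocks that malloc obtained through mmap are outside this range. */
bool points_to_heap(void* init_brk, void* pointer){
    /* compare as integers: relational operators on unrelated pointers
       are not well defined */
    uintptr_t lo = (uintptr_t)init_brk;
    uintptr_t hi = (uintptr_t)sbrk(0);
    uintptr_t p = (uintptr_t)pointer;
    return (lo <= p) && (p < hi);
}

int main(void){
    void* init_brk = sbrk(0);   /* record the break before allocating */
    int* heapvar = malloc(10 * sizeof *heapvar);
    int i = 0;
    int* stackvar = &i;
    assert(heapvar != NULL);
    assert(points_to_heap(init_brk, heapvar));
    assert(!points_to_heap(init_brk, stackvar));
    free(heapvar);
    return 0;
}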

Using `/proc/self/maps`

Two issues with this method:

This code is specific to Linux running on a 64-bit x86 CPU.
This method doesn't seem to work in unit tests written using the libcheck framework. There, all stack variables are also seen as heap variables.

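#define _POSIX_C_SOURCE 200809L   /* for getline() and ssize_t */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
#include <stdint.h>
#include <inttypes.h>
#include <sys/types.h>

/* Reads the [heap] range from /proc/self/maps; returns false if not found. */
bool get_heap_bounds(uint64_t* heap_start, uint64_t* heap_end){
    FILE *stream;
    char *line = NULL;
    size_t len = 0;
    ssize_t nread;
    bool found = false;

    stream = fopen("/proc/self/maps", "r");
    if (stream == NULL){
        return false;
    }

    while ((nread = getline(&line, &len, stream)) != -1) {
        if (strstr(line, "[heap]")){
            /* each line starts with "start-end" in hexadecimal */
            found = (sscanf(line, "%" SCNx64 "-%" SCNx64, heap_start, heap_end) == 2);
            break;
        }
    }

    free(line);
    fclose(stream);
    return found;
}

bool is_heap_var(void* pointer){
    uint64_t heap_start = 0;
    uint64_t heap_end = 0;
    uintptr_t p = (uintptr_t)pointer;

    if (!get_heap_bounds(&heap_start, &heap_end)){
        return false;
    }
    /* the end address listed in /proc/self/maps is exclusive */
    return p >= heap_start && p < heap_end;
}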

Feedback on this code is welcome!

Answer 7:

You have to use debugging tools to determine whether the pointer is on the stack or on the heap. On Windows, download the Sysinternals Suite, which provides various tools for debugging.