Is it possible to load a function into some alloca

Posted 2019-05-07 05:14

Question:

I'm messing around with some interprocess communication stuff and I am curious if it's possible to copy a function into some shared memory and run it from there from either process.

Something like:

memcpy(shared_memory_address, &func, sizeof(func));

I realize you can't take the size of the function but that was what popped into my head.

Answer 1:

Theoretically, as functions are just sequences of machine code somewhere in memory, you could copy the memory block of the function and call (jump into) it. C++ abstracts that possibility away, though: as you noticed, we cannot actually know the size of a function (although we can get a pointer to it).

Still, there are libraries. For example, you could tell a remote executable to load a specific function from a dynamic library and execute it. Check the Wikipedia article for references.
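As a rough sketch of the dynamic-library route on a POSIX system: the helper name call_from_shared_library is hypothetical, and the library name passed to it (for example libm.so.6 on glibc-based Linux) is an assumption about the platform. It uses dlopen and dlsym to look a function up by name at run time and call it. On Windows the analogous calls would be LoadLibrary and GetProcAddress.

```cpp
#include <dlfcn.h>

// Signature assumed for the looked-up symbol: double f(double).
using UnaryMath = double (*)(double);

// Loads `symbol` from `library` at run time and applies it to `x`.
// Returns false if the library or the symbol cannot be found.
bool call_from_shared_library(const char* library, const char* symbol,
                              double x, double& out)
{
    void* handle = dlopen(library, RTLD_NOW);
    if (!handle) {
        return false;
    }

    void* sym = dlsym(handle, symbol);
    if (!sym) {
        dlclose(handle);
        return false;
    }

    // The OS loader has already mapped the code as executable and
    // relocated it for this process, so a plain call is enough.
    UnaryMath fn = reinterpret_cast<UnaryMath>(sym);
    out = fn(x);

    dlclose(handle);
    return true;
}
```

For instance, call_from_shared_library("libm.so.6", "sqrt", 16.0, r) stores 4.0 in r where that library exists. Older glibc versions need -ldl when linking.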



Answer 2:

That was fun.
It seems you can, though I would NEVER do this.

Compiled on a Lenovo T61p running Windows 7, using g++ 4.3.4.

Note that some hardware prevents this: code can only be executed from a specific memory area (the program area), which is marked read-only in the hardware memory map to prevent self-modifying code.

Note also that the type of function is very limited:

In this example func() does very little and therefore works.
But if it does any of the following, it will not be portable to another process:

  • Call a function or method.
  • Pass a pointer (or reference)
    • No object that contains a pointer or a reference will work either.
  • Use globals.
  • You could pass a method pointer:
    • But the object it is used on must be passed by value.

None of the above work because the address space of one process bears no resemblance to the address space of another process (it is mapped to physical memory at the hardware level).

Silly Example

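#include <cstring>
#include <iostream>
#include <vector>

int func(int x)
{
    return x+1;
}

typedef int (*FUNC)(int);


int main()
{
    // 5000 is a guess: the real size of func() is unknown, so the copy
    // over-reads whatever follows it in memory (undefined behavior).
    std::vector<char>   buffer(5000);

    // Casting a function pointer to char* is only conditionally supported.
    std::memcpy(&buffer[0], reinterpret_cast<char*>(&func), buffer.size());

    // Treat the copied bytes as a function.
    FUNC copy   = reinterpret_cast<FUNC>(&buffer[0]);

    // Faults where executable space protection (DEP/NX) covers the heap.
    int result  = (*copy)(5);

    std::cout << result << std::endl;

}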


Answer 3:

Last time I tried this, I ran into a roadblock: determining the number of bytes in the function. The task would be to take the address of the function and copy its bytes into memory (provided the code is compiled as Position Independent Code, PIC).

A more platform-independent method is to review your compiler documentation to see if there is a #pragma, compiler option, or keyword that lets you specify the address or segment the function is placed at when the program is loaded.

Also, search the Embedded Systems groups, as this is a popular technique: Load code that programs a Flash Memory into RAM, execute the function in RAM, then reset the system.

Hope that helps.

Edit:
A suggestion: create a data or code segment using either an assembly language file or instructions to the linker (in the build script). Put your function into a separate code file. Tell the compiler and linker to compile this function into the new code segment. There may be compiler specific statements to get the starting address and size of a segment. Also, the OS may be able to load a segment at a given address for you.
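A minimal sketch of the section idea, under some loud assumptions: GCC or Clang targeting ELF, and a GNU ld compatible linker, which defines __start_NAME and __stop_NAME symbols for a section whose name is a valid C identifier. The section name copyme and the functions add_one and copyme_section_size are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>

extern "C" {
// Place the function in its own section so its extent can be measured.
// `used` and `noinline` keep the compiler from dropping or inlining it.
__attribute__((section("copyme"), noinline, used))
int add_one(int x)
{
    return x + 1;
}

// Assumption: the linker provides these two symbols, marking the
// first byte of the section and the byte just past its end.
extern const char __start_copyme[];
extern const char __stop_copyme[];
}

// Number of bytes occupied by everything placed in the copyme section.
std::size_t copyme_section_size()
{
    return reinterpret_cast<std::uintptr_t>(__stop_copyme) -
           reinterpret_cast<std::uintptr_t>(__start_copyme);
}
```

The pair (__start_copyme, copyme_section_size()) then gives a start address and a byte count that could be handed to memcpy. Whether the copied code still runs elsewhere depends on it being position independent.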

Also look into DLLs or Shared Libraries which can be loaded during run-time, with the help of the OS.



Answer 4:

If you attempt such a thing, you may run into problems running code from memory which isn't supposed to contain executable code. See this Wikipedia article for more information: http://en.wikipedia.org/wiki/Executable_space_protection



Answer 5:

Yes. A similar technique is used by Just-In-Time code generators such as the Java VM. In fact you could almost say that the operating system's runtime loader and linker is doing this for you as it loads dynamic libraries into your process.

You do have to request executable memory from the operating system, though. And the code you are jumping into has to be written in a way that allows it to be located anywhere in memory (position independent).
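A small sketch of requesting executable memory, assuming a POSIX system (mmap and mprotect) and an x86-64 CPU using the System V calling convention. The function name run_from_executable_memory is hypothetical, and the four code bytes are a hand-assembled, position-independent equivalent of "return x + 1", not something copied out of a compiled function. On other platforms the sketch simply reports failure.

```cpp
#include <sys/mman.h>
#include <cstring>

// Places a tiny routine in freshly mapped memory and calls it.
// Returns false if the platform is unsupported or the OS refuses.
bool run_from_executable_memory(int x, int& out)
{
#if defined(__x86_64__) && !defined(_WIN32)
    // x86-64 System V encoding of:  lea eax, [rdi + 1] ; ret
    static const unsigned char code[] = {0x8d, 0x47, 0x01, 0xc3};

    // Ask the OS for writable memory first.
    void* mem = mmap(nullptr, sizeof code, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        return false;
    }
    std::memcpy(mem, code, sizeof code);

    // Then switch it to read + execute, so the page is never
    // writable and executable at the same time.
    if (mprotect(mem, sizeof code, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, sizeof code);
        return false;
    }

    int (*fn)(int) = reinterpret_cast<int (*)(int)>(mem);
    out = fn(x);

    munmap(mem, sizeof code);
    return true;
#else
    (void)x;
    (void)out;
    return false;
#endif
}
```

For sharing between processes, the same idea would presumably use a shared mapping (for example shm_open plus mmap with MAP_SHARED) instead of an anonymous private one. Some hardened systems refuse PROT_EXEC on such mappings.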



Answer 6:

If you generate code bytes and inject them into the process, that's called run-time code generation (RTCG). You can look up some examples.

Modern kernels prevent this from working at a non-privileged level, so you have to enter the correct mode or ring first. To find the code size, you have (of course) to count the bytes of the function in its code segment up to the last return instruction.

As far as I know, graphics drivers sometimes used RTCG to create code for raster ops on the fly (problem dependent).



Answer 7:

You can reasonably assume that this is flatly impossible on Linux, Windows, or the more sophisticated embedded operating systems.

But if you are not operating with such pesky restrictions, you can patch in some guard bytes in your assembly that denote begin/end of functions and use those to help you copy stuff out to your shared memory (using assembly of course), then publish a list of procedure addresses to any interested process (also accessing/running using assembly).

Of course, there is a well-defined mechanism for providing libraries of code to multiple processes: the dynamic library system that Linux and Windows provide. Probably not as flexible as you'd like though. :-)