The following code doesn't work as intended but hopefully illustrates my attempt:
    #include <string.h>  /* memcpy */

    long foo (int a, int b) {
        return a + b;
    }

    void call_foo_from_stack (void) {
        /* reserve space on the stack to store foo's code */
        char code[sizeof(*foo)];
        /* have a pointer to the beginning of the code */
        long (*fooptr)(int, int) = (long (*)(int, int)) code;
        /* copy foo's code to the stack */
        memcpy(code, foo, sizeof(*foo));
        /* execute foo from the stack */
        fooptr(3, 5);
    }
Obviously, sizeof(*foo) doesn't return the size of the code of the foo() function.
I am aware that executing the stack is restricted on some CPUs (or at least if a restriction flag is set). Apart from GCC's nested functions, which can possibly end up stored on the stack, is there a way to do that in standard C?
The reserve and copy parts of your idea are fine. Getting a code pointer to your awesome stack code/data, that's harder. A typecast of the address of your stack to a code pointer should do the trick.
On a managed system, this code should never be allowed to execute. On an embedded system that shares code and data memory, it should work just fine. There are of course caching issues, security issues, job security issues when your peers read the code, etc. with this though...
Aside from all the other problems, I don't think anyone has yet mentioned that code in its final form in memory cannot in general be relocated. Your example foo function, maybe, but consider:
Part of the result:
Note the jne 401157 <_main+0x27>. In this case, we have an x86 conditional near jump instruction, 0x75 0x09, which goes 9 bytes forward. So that's relocatable: if we copy the code elsewhere then we still want to go 9 bytes forward. But what if it was a relative jump or call to code which isn't part of the function that you copied? You'd jump to some arbitrary location on or near your stack.
Not all jump and call instructions are like this (not on all architectures, and not even all on x86). Some refer to absolute addresses, by loading the address into a register and then doing a far jump/call. When the code is prepared for execution, the so-called "loader" will "fix up" the code by filling in whatever address the target ends up actually having in memory. Copying such code will (at best) result in code that jumps to or calls the same address as the original. If the target isn't in the code you're copying, that's probably what you want. If the target is in the code you're copying, then you're jumping to the original instead of to the copy.
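To make the relative-jump arithmetic concrete, here is a small sketch. It only computes addresses and executes nothing. The instruction address 0x40114c is derived from the quoted target (0x401157 minus the 9-byte displacement minus the 2-byte instruction length), and the function names, the start address 0x401130 and the copied lengths are hypothetical values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length in bytes of an x86 short conditional jump: opcode + rel8. */
enum { JCC_SHORT_LEN = 2 };

/* Target of a short jump at insn_addr with displacement disp.
   The displacement counts from the address of the NEXT instruction. */
static uintptr_t rel8_target(uintptr_t insn_addr, int8_t disp)
{
    return insn_addr + JCC_SHORT_LEN + (uintptr_t)(intptr_t)disp;
}

/* A relative jump survives a copy only if its target is copied too,
   i.e. the target lies inside the copied range [start, start + len). */
static int target_is_inside(uintptr_t start, size_t len, uintptr_t target)
{
    assert(len > 0);  /* an empty range contains nothing */
    return target >= start && target - start < len;
}
```

With these helpers, rel8_target(0x40114c, 9) gives 0x401157, matching the disassembly. If the copied range covers that target, the jump in the copy lands on the copied instruction; if it does not, the same two bytes jump 9 bytes forward into whatever happens to follow the copy.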
The same issues of relative vs. absolute addresses apply to things other than code. For example, references to data sections (containing string literals, global variables, etc) will go wrong if they're addressed relatively and aren't part of the copied code.
Also, a function pointer doesn't necessarily contain the address of the first instruction in the function. For example, on an ARM processor in ARM/Thumb interworking mode, the address of a Thumb function is 1 greater than the address of its first instruction. In effect, the least significant bit of the value isn't part of the address; it's a flag to tell the CPU to switch to Thumb mode as part of the jump.
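A rough sketch of what that flag bit means for anyone copying code by hand. It works on plain integer values rather than real function pointers, since converting a function pointer to an integer is itself outside standard C. The helper names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 0 of an interworking function pointer selects Thumb state. */
#define THUMB_BIT ((uintptr_t)1)

/* Nonzero if the pointer value refers to a Thumb function. */
static int is_thumb_pointer(uintptr_t fn_value)
{
    return (fn_value & THUMB_BIT) != 0;
}

/* Address of the first instruction: the pointer value with the flag cleared.
   This is where a copy of the code would have to start reading. */
static uintptr_t first_instruction_address(uintptr_t fn_value)
{
    return fn_value & ~THUMB_BIT;
}

/* Pointer value to call Thumb code placed at code_addr. */
static uintptr_t make_thumb_pointer(uintptr_t code_addr)
{
    assert((code_addr & THUMB_BIT) == 0);  /* instructions are at least 2-byte aligned */
    return code_addr | THUMB_BIT;
}
```

So a memcpy starting at the raw pointer value of a Thumb function would begin one byte too late, and a call through the bare buffer address would run the copied Thumb code in ARM state.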
sizeof(*foo) isn’t the size of the function foo. In standard C, applying sizeof to an expression of function type is a constraint violation; GCC accepts it as an extension and evaluates it to 1. Even sizeof(&foo) would only give the size of a pointer to foo (which will usually be the same size as every other pointer on your platform).
sizeof can’t measure the size of a function. The reason is that sizeof is a static operator, and the size of a function is not known at compile time.
Since the size of a function is not known at compile time, that also means that you can’t define a statically sized array that is large enough to contain a function.
You might be able to do something horrible using alloca and some nasty hacks, but the short answer is no, I don’t think you can do this with standard C.
It should also be noted that the stack is not executable on modern, secure operating systems. In some cases you might be able to make it executable, but that is a very bad idea that will leave your program wide open to stack smashing attacks and horrible bugs.
Your problem is roughly similar to dynamically generated code, except that you want to execute from stack instead of a generic memory region.
You'll need to grab enough stack to fit the copy of your function. You can find out how large the foo() function is by compiling it and looking at the resulting assembly. Then hard-code the size of your code[] array to fit at least that much. Also make sure code[], or the way you copy foo() into code[], gives the copied function the correct instruction alignment for your processor architecture.
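As a sketch of the sizing and alignment step only: the constants FOO_CODE_SIZE and CODE_ALIGN below are placeholders, not measured values, and the helper name is invented. The source bytes are passed in as a plain byte pointer because converting a function pointer to a data pointer is not guaranteed by standard C. The block uses C11 _Alignas and does not attempt to execute anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FOO_CODE_SIZE 64  /* hypothetical: take the real size from the disassembly */
#define CODE_ALIGN    16  /* hypothetical: instruction alignment for the target CPU */

/* Copy len bytes of machine code into an aligned buffer on the stack.
   Returns 1 if the buffer is aligned as requested and holds the same bytes. */
static int copy_to_aligned_stack_buffer(const unsigned char *src, size_t len)
{
    _Alignas(CODE_ALIGN) unsigned char code[FOO_CODE_SIZE];

    assert(len <= sizeof code);  /* the hard-coded size must be large enough */
    memcpy(code, src, len);

    return ((uintptr_t)code % CODE_ALIGN == 0) && memcmp(code, src, len) == 0;
}
```

The assert is the important design choice: if foo() grows after a recompile and the hard-coded size is not updated, the copy fails loudly instead of silently truncating the code.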
If your processor has an instruction prefetch buffer then you will need to flush it after the copy and prior to executing the function from stack, or it will almost certainly have prefetched the wrong data and you'll end up executing garbage. Managing the prefetch buffer and associated caches is the biggest stumbling block I've encountered in experimenting with dynamically generated code.
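One possible way to handle the flush, assuming a GCC or Clang toolchain: both provide the __builtin___clear_cache builtin, which takes the start and end of the modified range. The wrapper name copy_code_and_sync is made up here. On other compilers, or on bare-metal targets, the platform's own cache maintenance routine would be needed instead, and this sketch simply skips the step.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy len bytes of code to dst, then ask the toolchain to make the
   instruction fetch path see the new bytes before anything jumps to dst. */
static void copy_code_and_sync(unsigned char *dst, const unsigned char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, len);
#if defined(__GNUC__) || defined(__clang__)
    /* Compiler builtin, not standard C. It may compile to nothing on
       architectures that keep instruction and data caches coherent. */
    __builtin___clear_cache((char *)dst, (char *)dst + len);
#endif
}
```

The call goes after the copy and before the first jump into the buffer; flushing before the copy does not help, because the stale instructions are fetched afterwards.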
As others have mentioned, if your stack isn't executable then this is a non-starter.