I had a little too much time on my hands and started wondering if I could write a self-modifying program. To that end, I wrote a "Hello World" in C, then used a hex editor to find the location of the "Hello World" string in the compiled executable. Is it possible to modify this program to open itself and overwrite the "Hello World" string?
char* str = "Hello World\n";
int main(int argc, char* argv) {
printf(str);
FILE * file = fopen(argv, "r+");
fseek(file, 0x1000, SEEK_SET);
fputs("Goodbyewrld\n", file);
fclose(file);
return 0;
}
This doesn't work, I'm assuming there's something preventing it from opening itself since I can split this into two separate programs (A "Hello World" and something to modify it) and it works fine.
EDIT: My understanding is that when the program is run, it's loaded completely into ram. So the executable on the hard drive is, for all intents and purposes a copy. Why would it be a problem for it to modify itself?
Is there a workaround?
Thanks
If we're talking about doing this in an x86 environment it shouldn't be impossible. It should be used with caution though because x86 instructions are variable-length. A long instruction may overwrite the following instruction(s) and a shorter one will leave residual data from the overwritten instruction which should be noped (NOP instruction).
When the x86 first became protected the intel reference manuals recommended the following method for debugging access to XO (execute only) areas:
So the answer to the problem is in the last step. The RW is necessary if you want to be able to insert the breakpoint instruction which is what debuggers do. More modern processors than the 80286 have internal debug registers to enable non-intrusive monitoring functionality which could result in a breakpoint being issued.
Windows made available the building blocks for doing this starting with Win16. They are probably still in place. I think Microsoft calls this class of pointer manipulation "thunking."
I once wrote a very fast 16-bit database engine in PL/M-86 for DOS. When Windows 3.1 arrived (running on 80386s) I ported it to the Win16 environment. I wanted to make use of the 32-bit memory available but there was no PL/M-32 available (or Win32 for that matter).
to solve the problem my program used thunking in the following way
Once the mechanism was bug free it worked without a hitch. The largest memory areas my program used were 2304*2304 double precision which comes out to around 40MB. Even today, I would call this a "large" block of memory. In 1995 it was 30% of a typical SDRAM stick (128 MB PC100).
On newer versions of Windows CE (atleast 5.x an newer) where apps run in user space, (compared to earlier versions where all apps ran in supervisor mode), apps cannot even read it's own executable file.
All present answers more or less revolve around the fact that today you cannot easily do self-modifying machine code anymore. I agree that that is basically true for today's PCs.
However, if you really want to see own self-modifying code in action, you have some possibilities available:
Try out microcontrollers, the simpler ones do not have advanced pipelining. The cheapest and quickest choice I found is an MSP430 USB-Stick
If an emulation is ok for you, you can run an emulator for an older non-pipelined platform.
If you wanted self-modifying code just for the fun of it, you can have even more fun with self-destroying code (more exactly enemy-destroying) at Corewars.
If you are willing to move from C to say a Lisp dialect, code that writes code is very natural there. I would suggest Scheme which is intentionally kept small.