Can a C program modify its executable file?

Posted 2019-02-03 02:28

I had a little too much time on my hands and started wondering if I could write a self-modifying program. To that end, I wrote a "Hello World" in C, then used a hex editor to find the location of the "Hello World" string in the compiled executable. Is it possible to modify this program to open itself and overwrite the "Hello World" string?

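#include <stdio.h>

const char *str = "Hello World\n";

int main(int argc, char *argv[]) {
  (void)argc;

  printf("%s", str);

  /* argv[0] is usually, but not always, the path of the running executable. */
  FILE *file = fopen(argv[0], "r+");
  if (file == NULL) {
    perror("fopen");  /* report why the open failed */
    return 1;
  }

  /* 0x1000 is the offset of the string found with the hex editor. */
  fseek(file, 0x1000, SEEK_SET);
  fputs("Goodbyewrld\n", file);
  fclose(file);

  return 0;
}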

This doesn't work. I'm assuming something prevents the program from opening itself, since splitting it into two separate programs (a "Hello World" and something to modify it) works fine.
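For reference, the core of the two-program variant can be sketched as a same-length in-place patch. This is only an illustration: `patch_image` is a hypothetical helper name, and it works on an in-memory copy of the file rather than on the file itself, so it searches for the string instead of relying on a fixed offset such as 0x1000.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find the first occurrence of old_text in an
 * in-memory image of a file and overwrite it with new_text.
 * Both strings must have the same length so that nothing after the
 * patched bytes shifts. Returns the byte offset of the patch, or -1
 * if old_text was not found. */
long patch_image(unsigned char *image, size_t image_len,
                 const char *old_text, const char *new_text)
{
    size_t n = strlen(old_text);
    assert(n > 0 && strlen(new_text) == n);

    for (size_t i = 0; i + n <= image_len; i++) {
        if (memcmp(image + i, old_text, n) == 0) {
            memcpy(image + i, new_text, n);
            return (long)i;
        }
    }
    return -1;
}
```

A separate patcher program would read the target executable into a buffer, call `patch_image(buf, len, "Hello World\n", "Goodbyewrld\n")`, and write the buffer back out.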

EDIT: My understanding is that when the program is run, it is loaded completely into RAM, so the executable on the hard drive is, for all intents and purposes, a copy. Why would it be a problem for it to modify itself?

Is there a workaround?

Thanks

9 answers
你好瞎i
#2 · 2019-02-03 03:12

If we're talking about doing this in an x86 environment, it shouldn't be impossible. It should be done with caution, though, because x86 instructions are variable-length. A longer instruction will overwrite the following instruction(s), and a shorter one will leave residual bytes from the overwritten instruction, which should be replaced with NOP instructions.
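As a rough sketch of the padding rule, the helper below patches one instruction inside a plain byte buffer and fills the leftover bytes with the single-byte NOP opcode (0x90). The name `patch_instruction` is made up for this example, and the caller is assumed to already know the length of the instruction being replaced. Nothing is executed here; the buffer only stands in for code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define X86_NOP 0x90u  /* single-byte NOP opcode on x86 */

/* Hypothetical helper: overwrite an existing instruction of old_len
 * bytes at code with a replacement of new_len bytes.
 * A shorter replacement is padded with NOPs so that no stale bytes of
 * the old instruction remain. A longer replacement is refused because
 * it would clobber the instruction that follows.
 * Returns 0 on success, -1 if the replacement does not fit. */
int patch_instruction(uint8_t *code, size_t old_len,
                      const uint8_t *new_insn, size_t new_len)
{
    assert(code != NULL && new_insn != NULL);

    if (new_len > old_len)
        return -1;
    memcpy(code, new_insn, new_len);
    memset(code + new_len, X86_NOP, old_len - new_len);
    return 0;
}
```

For example, replacing the 5-byte `mov eax, 1` (B8 01 00 00 00) with the 2-byte `xor eax, eax` (31 C0) leaves 31 C0 90 90 90, and a following `ret` (C3) stays untouched.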

When the x86 first gained protected mode, the Intel reference manuals recommended the following method for debugging access to XO (execute-only) areas:

  1. create a new, empty selector ("high" part of far pointers)
  2. set its attributes to that of the XO area
  3. the new selector's access properties must be set RO DATA if you only want to look at what's in it
  4. if you want to modify the data the access properties must be set to RW DATA

So the answer to the problem is in the last step. RW access is necessary if you want to insert a breakpoint instruction, which is what debuggers do. Processors more modern than the 80286 have internal debug registers that enable non-intrusive monitoring, which can also trigger a breakpoint.
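On a current POSIX system the same idea (get write access first, then plant the breakpoint byte) might be sketched with `mprotect` instead of selector aliasing. This is a modern analogue chosen for illustration, not the 80286 method described above. `plant_breakpoint` and `demo_breakpoint` are hypothetical names, and the demo uses an anonymous read-only page as a stand-in for a code page, so nothing is executed.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* assumption: glibc, needed for MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#define X86_INT3 0xCCu  /* one-byte x86 breakpoint opcode */

/* Hypothetical helper: make the page containing addr writable, save the
 * original byte in *saved, store INT3 in its place, then restore the
 * page protection to prot. Returns 0 on success, -1 on failure. */
int plant_breakpoint(uint8_t *addr, int prot, uint8_t *saved)
{
    assert(addr != NULL && saved != NULL);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    /* Round the address down to the start of its page. */
    uint8_t *base = (uint8_t *)((uintptr_t)addr & ~((uintptr_t)page - 1));

    if (mprotect(base, (size_t)page, PROT_READ | PROT_WRITE) != 0)
        return -1;
    *saved = *addr;
    *addr = X86_INT3;
    return mprotect(base, (size_t)page, prot);
}

/* Demo on an anonymous page standing in for a code page.
 * Returns 0 if the breakpoint byte was planted and the old byte saved. */
int demo_breakpoint(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    uint8_t *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    p[0] = 0x55;  /* opcode of "push ebp", a typical first byte of a function */

    int ok = -1;
    uint8_t saved = 0;
    if (mprotect(p, (size_t)page, PROT_READ) == 0 &&   /* now read-only */
        plant_breakpoint(p, PROT_READ, &saved) == 0 &&
        saved == 0x55 && p[0] == X86_INT3)
        ok = 0;

    munmap(p, (size_t)page);
    return ok;
}
```

A debugger would later write the saved byte back the same way to remove the breakpoint.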

Windows made available the building blocks for doing this starting with Win16. They are probably still in place. I think Microsoft calls this class of pointer manipulation "thunking."


I once wrote a very fast 16-bit database engine in PL/M-86 for DOS. When Windows 3.1 arrived (running on 80386s) I ported it to the Win16 environment. I wanted to make use of the 32-bit memory available but there was no PL/M-32 available (or Win32 for that matter).

To solve the problem, my program used thunking in the following way:

  1. defined 32-bit far pointers (sel_16:offs_32) using structures
  2. allocated 32-bit data areas (that is, larger than 64 KB) using global memory and received them in 16-bit far pointer (sel_16:offs_16) format
  3. filled in the data in the structures by copying the selector, then calculating the offset using 16-bit multiplication with 32-bit results.
  4. loaded the pointer/structure into es:ebx using the instruction size override prefix
  5. accessed the data using a combination of the instruction size and operand size prefixes

Once the mechanism was bug-free, it worked without a hitch. The largest memory areas my program used were 2304*2304 double-precision values, which comes out to around 40 MB. Even today, I would call this a "large" block of memory. In 1995 it was 30% of a typical SDRAM stick (128 MB PC100).
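The pointer construction in steps 1 to 3 can be modelled in portable C roughly as follows. This is a sketch of the arithmetic only, not the original PL/M-86 code: `struct far48`, `mul16x16`, and `element_ptr` are names invented here, and the block is assumed to start at offset 0 of its selector and to hold a row-major matrix of 8-byte doubles.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 16:32 far pointer (sel_16:offs_32). */
struct far48 {
    uint32_t offset;    /* 32-bit offset within the segment */
    uint16_t selector;  /* selector copied from the 16:16 far pointer */
};

/* 16-bit multiplication with a 32-bit result, as the x86 MUL
 * instruction with 16-bit operands delivers in DX:AX. */
static uint32_t mul16x16(uint16_t a, uint16_t b)
{
    return (uint32_t)a * (uint32_t)b;
}

/* Build a pointer to element (row, col) of a row-major matrix of
 * 8-byte doubles with cols columns, stored in the block addressed by
 * selector sel. Assumes the block starts at offset 0 and that the
 * byte offset fits in 32 bits. */
struct far48 element_ptr(uint16_t sel, uint16_t row, uint16_t col,
                         uint16_t cols)
{
    assert(cols > 0 && col < cols);

    struct far48 p;
    p.selector = sel;
    p.offset = (mul16x16(row, cols) + col) * 8u;  /* 8 bytes per double */
    return p;
}
```

With these assumptions the full 2304*2304 matrix occupies `mul16x16(2304, 2304) * 8` = 42,467,328 bytes, and the last element sits at offset 42,467,320, far beyond what a 16-bit offset can reach.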

Evening l夕情丶
#3 · 2019-02-03 03:17

On newer versions of Windows CE (at least 5.x and newer), where apps run in user space (compared to earlier versions, where all apps ran in supervisor mode), an app cannot even read its own executable file.

叼着烟拽天下
#4 · 2019-02-03 03:24

All present answers more or less revolve around the fact that today you cannot easily do self-modifying machine code anymore. I agree that that is basically true for today's PCs.

However, if you really want to see your own self-modifying code in action, you have some possibilities available:

  • Try out microcontrollers; the simpler ones do not have advanced pipelining. The cheapest and quickest choice I found is an MSP430 USB stick.

  • If an emulation is ok for you, you can run an emulator for an older non-pipelined platform.

  • If you wanted self-modifying code just for the fun of it, you can have even more fun with self-destroying code (more exactly enemy-destroying) at Corewars.

  • If you are willing to move from C to, say, a Lisp dialect, code that writes code is very natural there. I would suggest Scheme, which is intentionally kept small.
