This malloc shouldn't work

Posted 2020-04-26 00:17

Question:

Here is my code.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  char *s;
  int i = 0;

  /* sizeof yields a size_t, which is printed with %zu */
  printf("%zu \n", sizeof(s));

  s = malloc(sizeof(char) * 2);

  printf("%zu \n", sizeof(s));
  /*Why is this working?*/
  while (i <= 5)
    {
      s[i] = 'l';
      i++;
    }
  printf("%s \n", s);

  printf("%zu \n", sizeof(char));

  printf("%zu \n", sizeof(s[0]));
}

In my opinion, this should segfault as I'm trying to write more than I allocated. Why is this working?

Answer 1:

To illustrate (and in complete agreement with) @Ed S's answer, try this version of your code. The only change is a second pointer, declared exactly the same way and malloc'ed right after char *s.

There is no guarantee that the two allocations sit next to each other in memory, but creating them this way makes it quite probable. If they do, the memory that s encroaches on now belongs to t (or to the allocator's bookkeeping for it), and you may get a seg fault:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *s;
    char *t;//addition
    int i = 0;

    /* sizeof yields a size_t, which is printed with %zu */
    printf("%zu \n", sizeof(s));

    s = malloc(sizeof(char) * 2);
    t = malloc(sizeof(char) * 2);//addition

    printf("%zu \n", sizeof(s));
    /*Why is this working?*/
    while (i <= 5)
    {
      s[i] = 'l';
      i++;
    }
    printf("%s \n", s);

    printf("%zu \n", sizeof(char));

    printf("%zu \n", sizeof(s[0]));
}

Note: In the environment I use (Windows 7, NI Run-Time, debug build, etc.), I get a seg fault either way, which somewhat supports the undefined behavior assertions in other answers.



Answer 2:

"In my opinion, this should segfault as I'm trying to write more than I allocated. Why is this working?"

It's not "working"; your code invokes undefined behavior. "Undefined behavior" doesn't mean "your code will segfault." That would be defined behavior. UB means anything can happen.

In this case, you're stomping on memory you don't own. That will sometimes segfault, but don't count on it. C has no notion of "segfaults"; that comes from your OS.



Answer 3:

A segfault is a signal from the OS telling you that a particular memory region is none of your business. It just so happens that what you're accessing doesn't trip the OS's memory protection. There are tons of ways to exploit that (hijacking function calls, attacking binaries by overwriting stack values, etc.).

It may also be the case that your malloc doesn't set aside those 2 bytes and 2 bytes only. When it needs more memory, malloc asks the OS for it through a system call (for example sbrk or mmap on Linux, VirtualAlloc on Windows), and the OS hands memory out in whole virtual memory pages, which are far larger than 2 bytes. The OS maps those pages into your process and protects them so that nobody else (read: another process/application) accidentally treads on your memory; an access to an address that is not mapped for the accessing process is what gets hit with a segfault. Because that protection works page by page, a write a few bytes past your block usually still lands in memory your process owns.

And there's also the undefined behavior thing the others mentioned.



Answer 4:

It seems that malloc allocates some bytes more than you specify.

When you use malloc(sizeof(s)*2); //8 then while (i <= 36) is OK, but while (i <= 37) is not.

When you use e.g. malloc(sizeof(s)*4); //16 then while (i <= 7572) is OK, but while (i <= 7573) is not.

(I tested in Code::Blocks.)

Too bad Dennis Ritchie is dead; it's still a big mystery to me why it's like that. But don't worry about it too much: always allocate as much as you need, and null-terminate your strings too.



Answer 5:

As others have said, it's hit or miss whether the code will cause a runtime error in production, since bounds checking is not built into C (unlike languages such as Java or C#). The code will, however, cause an error under a memory checker.

You probably know Valgrind, so that's an exercise left to the reader. Here's the same code under Clang's AddressSanitizer (I added a printf("malloc: %p \n", s);):

$ ./t.exe | /usr/local/bin/asan_symbolize.py 
malloc: 0x60200000b3b0 
=================================================================
==98557==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000b3b2 at pc 0x1065c4b5b bp 0x7fff5963b810 sp 0x7fff5963b808
WRITE of size 1 at 0x60200000b3b2 thread T0
    #0 0x1065c4b5a (/Users/jwalton/./t.exe+0x100000b5a)
    #1 0x7fff870e27e0 (/usr/lib/system/libdyld.dylib+0x27e0)
    #2 0x0
0x60200000b3b2 is located 0 bytes to the right of 2-byte region [0x60200000b3b0,0x60200000b3b2)
allocated by thread T0 here:
    #0 0x1065d8cd5 (/usr/local/lib/clang/3.3/lib/darwin//libclang_rt.asan_osx_dynamic.dylib+0xfcd5)
    #1 0x1065c4971 (/Users/jwalton/./t.exe+0x100000971)
    #2 0x7fff870e27e0 (/usr/lib/system/libdyld.dylib+0x27e0)
    #3 0x0
Shadow bytes around the buggy address:
  0x1c0400001620: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c0400001630: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c0400001640: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c0400001650: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c0400001660: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x1c0400001670: fa fa fa fa fa fa[02]fa fa fa 00 00 fa fa fd fa
  0x1c0400001680: fa fa fd fa fa fa 00 00 fa fa fd fa fa fa fd fa
  0x1c0400001690: fa fa 00 00 fa fa 00 00 fa fa fd fa fa fa fd fa
  0x1c04000016a0: fa fa fd fa fa fa fd fa fa fa fd fa fa fa fd fa
  0x1c04000016b0: fa fa 00 00 fa fa fd fa fa fa fd fa fa fa 00 00
  0x1c04000016c0: fa fa 00 00 fa fa fd fa fa fa fd fa fa fa 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:     fa
  Heap right redzone:    fb
  Freed heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  ASan internal:         fe
==98557==ABORTING


Tags: c malloc