Why do compilers not warn about out-of-bounds stat

Posted 2019-01-11 06:56

A colleague of mine recently got bitten badly by writing out of bounds to a static array on the stack (he added an element to it without increasing the array size). Shouldn't the compiler catch this kind of error? The following code compiles cleanly with gcc, even with the -Wall -Wextra options, and yet it is clearly erroneous:

int main(void)
{
  int a[10];
  a[13] = 3;  // oops, overwrote the return address
  return 0;
}

I'm positive that this is undefined behavior, although I can't find an excerpt from the C99 standard saying so at the moment. But in the simplest case, where the size of an array is known at compile time and the indices are known at compile time, shouldn't the compiler emit a warning at the very least?

10 answers
仙女界的扛把子
Reply #2 · 2019-01-11 07:17

It's not a static array.

Undefined behavior or not, it's writing to an address 13 integers from the beginning of the array. What's there is your responsibility. There are several C techniques that intentionally misallocate arrays for reasonable reasons. And this situation is not unusual in incomplete compilation units.

Depending on your flag settings, there are a number of features of this program that would be flagged, such as the fact that the array is never used. And the compiler might just as easily optimize it out of existence and not tell you - a tree falling in the forest.

It's the C way. It's your array, your memory, do what you want with it. :)

(There are any number of lint tools for helping you find this sort of thing, and you should use them liberally. They don't all work through the compiler, though; compiling and linking are often tedious enough as it is.)

爷、活的狠高调
Reply #3 · 2019-01-11 07:19

GCC does warn about this. But you need to do two things:

  1. Enable optimization. Without at least -O2, GCC is not doing enough analysis to know what a is, and that you ran off the edge.
  2. Change your example so that a[] is actually used, otherwise GCC generates a no-op program and has completely discarded your assignment.


$ cat foo.c 
int main(void)
{
  int a[10];
  a[13] = 3;  // oops, overwrote the return address
  return a[1];
}
$ gcc -Wall -Wextra  -O2 -c foo.c 
foo.c: In function ‘main’:
foo.c:4: warning: array subscript is above array bounds

BTW: If you returned a[13] in your test program, that wouldn't work either, as GCC optimizes out the array again.

来,给爷笑一个
Reply #4 · 2019-01-11 07:24

The reason C doesn't do it is that C doesn't have the information. A statement like

int a[10];

does two things: it allocates sizeof(int)*10 bytes of space (plus, potentially, a little dead space for alignment), and it puts an entry in the symbol table that reads, conceptually,

a : address of a[0]

or in C terms

a : &a[0]

and that's all. In fact, in C you can interchange *(a+i) with a[i] in (almost*) all cases with no effect BY DEFINITION. So your question is equivalent to asking "why can I add any integer to this (address) value?"

* Pop quiz: what is the one case in which this isn't true?

We Are One
Reply #5 · 2019-01-11 07:29

You're right, the behavior is undefined. In C99, pointers must point within, or just one element beyond, declared or heap-allocated data structures.

I've never been able to figure out how the gcc people decide when to warn. I was shocked to learn that -Wall by itself will not warn of uninitialized variables; at minimum you need -O, and even then the warning is sometimes omitted.

I conjecture that because unbounded arrays are so common in C, the compiler probably doesn't have a way in its expression trees to represent an array that has a size known at compile time. So although the information is present at the declaration, I conjecture that at the use it is already lost.

I second the recommendation of valgrind. If you are programming in C, you should run valgrind on every program, all the time until you can no longer take the performance hit.

成全新的幸福
Reply #6 · 2019-01-11 07:29

There are some extensions for gcc for that (on the compiler side): http://www.doc.ic.ac.uk/~awl03/projects/miro/

On the other hand, splint, rat, and quite a few other static code analysis tools would have found that.

You can also use valgrind on your code and see the output: http://valgrind.org/

Another widely used library seems to be libefence.

It's simply a design decision that was once made, which now leads to these things.

Regards Friedrich

倾城 Initia
Reply #7 · 2019-01-11 07:29

The -fbounds-checking option is available with gcc.

It is worth going through this article: http://www.doc.ic.ac.uk/~phjk/BoundsChecking.html

'le dorfier' has given an apt answer to your question, though: it's your program, and this is the way C behaves.
