When main is defined without parameters, will argc

Posted 2020-02-25 05:29

Consider the very simple:

int main(void) {
    return 0;
}

I compiled it (with mingw32-gcc) and executed it as main.exe foo bar.

Now, I had expected some sort of crash or error caused by a main function explicitly declared as being bereft of parameters. The lack of errors led to this question, which is really four questions.

  • Why does this work? Answer: Because the standard says so!

  • Are the input parameters just ignored or is the stack prepared with argc & argv silently? Answer: In this particular case, the stack is prepared.

  • How do I verify the above? Answer: See rascher's answer.

  • Is this platform dependent? Answer: Yes, and no.

Tags: c mingw
6 answers
姐就是有狂的资本
Reply #2 · 2020-02-25 05:29

From the C99 standard:

5.1.2.2.1 Program startup

The function called at program startup is named main. The implementation declares no prototype for this function. It shall be defined with a return type of int and with no parameters:

int main(void) { /* ... */ }

or with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared):

int main(int argc, char *argv[]) { /* ... */ }

or equivalent; or in some other implementation-defined manner.

淡お忘
Reply #3 · 2020-02-25 05:34

I don't know the cross-platform answer to your question. But this made me curious. So what do we do? Look at the stack!

For the first iteration:

test.c

int main(void) {
   return 0;
}

test2.c

int main(int argc, char *argv[]) {
   return 0;
}

And now look at the assembly output:

$ gcc -S -o test.s test.c 
$ cat test.s 
        .file   "test.c"
        .text
.globl main
        .type   main, @function
main:
        pushl   %ebp
        movl    %esp, %ebp
        movl    $0, %eax
        popl    %ebp
        ret
        .size   main, .-main
        .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
        .section        .note.GNU-stack,"",@progbits

Nothing exciting here. Except for one thing: both C programs have the same assembly output!

This basically makes sense; we never really have to push/pop anything off of the stack for main(), since it's the first thing on the call stack.

So then I wrote this program:

int main(int argc, char *argv[]) {
   return argc;
}

And its asm:

main:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %eax
        popl    %ebp
        ret

This tells us that "argc" is located at 8(%ebp)

So now for two more C programs:

int main(int argc, char *argv[]) {
__asm__("movl    8(%ebp), %eax\n\t"
        "popl    %ebp\n\t"
        "ret");
        /*return argc;*/
}


int main(void) {
__asm__("movl    8(%ebp), %eax\n\t"
        "popl    %ebp\n\t"
        "ret");
        /*return argc;*/
}

We've stolen the "return argc" code from above and pasted it into the asm of these two programs. When we compile and run them, and then invoke echo $? (which echoes the exit status of the previous process), we get the "right" answer. When I run "./test a b c d", $? gives me "5" for both programs, even though only one has argc/argv defined. This tells me that, on my platform, argc is for sure placed on the stack. I'd bet that a similar test would confirm this for argv.

Try this on Windows!

做个烂人
Reply #4 · 2020-02-25 05:40
  1. Why it works: Generally, function arguments are passed in specific places (registers or stack, usually). A function without arguments will never check them, so their contents are irrelevant. This depends on calling and naming conventions, but see #4.

  2. The stack will typically be prepared. On platforms where argv is parsed by the runtime library, such as DOS, the compiler may choose not to link in the code if nothing uses argv, but that is complexity few deem necessary. On other platforms, argv is prepared by exec() before your program is even loaded.

  3. Platform dependent, but on Linux systems, for instance, you can in fact examine the argv contents in /proc/PID/cmdline whether or not they're used. Many platforms also provide separate calls to find arguments.

  4. As per the standard quoted by Tim Schaeffer, main does not need to accept the arguments. On most platforms, the arguments themselves will still exist, but a main() without arguments will never know of them.
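
Point 3 can be tried out with a few lines of C. This is only a minimal sketch, assuming a Linux system with /proc mounted; the helper names read_own_cmdline and count_packed_args are made up for illustration. It relies on /proc/self/cmdline holding the arguments as consecutive NUL-terminated strings.

```c
#include <stdio.h>
#include <stddef.h>

/* Count the NUL terminators in buf[0..len). In /proc/PID/cmdline each
   argument ends with a NUL byte, so this is the number of complete
   arguments found in the buffer. */
size_t count_packed_args(const char *buf, size_t len) {
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\0') {
            count++;
        }
    }
    return count;
}

/* Linux-specific assumption: read this process's own command line
   from /proc/self/cmdline. Returns the number of bytes read, or 0 if
   the file cannot be opened (for example on a non-Linux system). */
size_t read_own_cmdline(char *buf, size_t cap) {
    FILE *f = fopen("/proc/self/cmdline", "rb");
    if (f == NULL) {
        return 0;
    }
    size_t n = fread(buf, 1, cap, f);
    fclose(f);
    return n;
}
```

Calling read_own_cmdline and then count_packed_args from an int main(void) should give the same count that argc would have held, even though main never declared it.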

女痞
Reply #5 · 2020-02-25 05:46

In most compilers, __argc and __argv exist as global variables from the runtime library. The values will be correct.

On Windows, they won't be correct if the entry point has the UTF-16 signature, which is also the only way of getting the right command arguments on that platform. They will be empty in that case, but this is not your case, and there are two wide-character alternative variables.

男人必须洒脱
Reply #6 · 2020-02-25 05:48

A few remarks are in order.

The standard basically lists what main is most likely to be: a function taking no arguments, or a function taking two arguments, or whatever else!

See for example my answer to this question.

But your question points to other facts.

Why does this work? Answer: Because the standard says so!

That is not correct. It works for other reasons: it works because of the calling convention.

One such convention is: arguments are pushed onto the stack, and the caller is responsible for cleaning up the stack. Because of this, in the actual asm code, the callee can completely ignore what is on the stack. A call looks like

   push value1
   push value2
   call function
   add esp, 8

(Intel examples, just to stay in the mainstream).

What the function does with the arguments pushed onto the stack is completely irrelevant; everything will still work fine! And this is true even if the calling convention is different, e.g.

   li  $a0, value
   li  $a1, value
   jal function

Whether or not the function looks at registers $a0 and $a1 changes nothing.

So the callee can ignore the arguments without harm: it can believe they do not exist, or it can know they exist but prefer to ignore them. (The opposite case would be problematic: the callee reading values from the stack or registers when the caller passed nothing.)

This is why things work.
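
The same effect can be watched from within standard C, without relying on undefined behaviour, by using a variadic function. This is only an analogy and not how main itself is invoked; sum_first is a hypothetical name chosen for the sketch. The caller always passes the extra values, and the callee decides how many of them it reads.

```c
#include <stdarg.h>

/* Sum only the first `used` extra arguments (each read as an int).
   Any further arguments the caller passed are never looked at,
   and the call still returns normally. */
int sum_first(int used, ...) {
    va_list ap;
    int total = 0;
    va_start(ap, used);
    for (int i = 0; i < used; i++) {
        total += va_arg(ap, int);
    }
    va_end(ap);
    return total;
}
```

A call such as sum_first(0, 10, 20, 30) passes three values that are never read, much like an int main(void) that is handed argc and argv by the startup code.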

From the C point of view, if we are on a system where the startup code calls main with two arguments (int and char **) and expects an int return value, the "right" prototype would be

 int main(int argc, char **argv) { }

But let us suppose now that we do not use these arguments.

Is it more correct to say int main(void) or int main() (still on the same system, where the implementation calls main with two arguments and expects an int return value, as said before)?

The standard does not actually say what we have to do. The correct "prototype" stating that there are two arguments is still the one shown before.

But from a logical point of view, the right way of saying that there are arguments (we know it) but we are not interested in them, is

 int main() { /* ... */ }

In this answer I've shown what happens if we pass arguments to a function declared as int func() and what happens if we pass arguments to a function declared as int func(void).

In the second case we have an error since (void) explicitly says the function has no arguments.

With main we can't get an error, since there is no real prototype mandating the arguments. It is worth noting, though, that gcc -std=c99 -pedantic gives no warning for either int main() or int main(void), which would mean that either 1) gcc is not C99 compliant even with the std flag, or 2) both forms are standard compliant. Option 2 is the more likely.

One is explicitly standard compliant (int main(void)), the other is indeed int main(int argc, char **argv), but without explicitly saying the arguments, since we are not interested in them.

int main(void) works even when arguments exist, for the reasons given above. But it states that main takes no arguments. On the many systems where we could have written int main(int argc, char **argv), that statement is false, and int main() should be preferred instead.

Another interesting thing to notice is that if we declare main as not returning a value (void main()) on a system where the implementation expects a return value, we get a warning. This is because the caller expects to do something with that value, so it is "undefined behaviour" if we do not return one (which, in the case of main, does not mean writing an explicit return, but declaring main as returning an int).

In much of the startup code I've seen, main is called in one of these ways:

  retval = main(_argc, _argv);
  retval = main(_argc, _argv, environ);
  retval = main(_argc, _argv, environ, apple); // apple specific stuff

But there can be startup code that calls main differently, e.g. retval = main(); to reflect this, we can use int main(void). On such a system, int main(int argc, char **argv) would still compile, but the program would crash if we actually used the arguments (since the retrieved values would be rubbish).
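
The first of the call patterns listed above can be mimicked in ordinary C. This is a minimal sketch only: fake_startup and program_entry are hypothetical stand-ins for the runtime's startup routine and for main, and the argument strings are made up. Real startup code is implementation specific and does considerably more.

```c
#include <stddef.h>

/* Hypothetical stand-in for a program's main that does use argc. */
int program_entry(int argc, char **argv) {
    (void)argv;       /* argv is deliberately unused in this sketch */
    return argc - 1;  /* number of arguments after the program name */
}

/* Hypothetical stand-in for startup code: it builds argc and argv,
   calls the entry function, and hands back the value that real
   startup code would pass on as the exit status. */
int fake_startup(int (*entry)(int, char **)) {
    char arg0[] = "prog";
    char arg1[] = "foo";
    char arg2[] = "bar";
    char *argv[] = { arg0, arg1, arg2, NULL };
    int retval = entry(3, argv);
    return retval;
}
```

The caller prepares the arguments regardless of what the entry function does with them, which is the situation described for main.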

Is this platform dependent?

The way main is called is platform dependent (implementation specific), as the standard allows. The "supposed" main prototype is a consequence of that and, as already said, if we know that arguments are passed in but we are not going to use them, we should use int main() as a short form of the longer int main(int argc, char **argv), while int main(void) means something different: that main takes no arguments (which is false on the system we are thinking about).

放荡不羁爱自由
Reply #7 · 2020-02-25 05:50

In classic C, you can do something similar:

void f() {}

f(5, 6);

There is nothing stopping you from calling a function with a different number of arguments than its definition assumes. (Modern compilers, naturally, consider this an egregious error and will strongly resist actually compiling the code.)

The same thing happens with your main() function. The C runtime library will call

main(argc, argv);

but the fact that your function is not prepared to receive those two arguments is of no consequence to the caller.
