Why does this method for function overloading in C work?

Posted 2019-01-29 01:47

Question:

I've looked into some ways of doing this in C, but I've only found solutions for C99.

But I've come across the solution below, taken from Lock Less.

The thing is, I don't quite understand how it works, and I would like to know the fundamentals of what is going on there so I can understand it more clearly.

I've searched the web for a while and found this about __VA_ARGS__, but unfortunately that alone wasn't enough.

I would really appreciate an explanation or some guidance about this matter, any kind of reference would help.

I've compiled this code with GCC-5.4.1 with the -ansi flag.

#include <stdarg.h>
#include <stdlib.h>
#include <stdio.h>

#define COUNT_PARMS2(_1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _, ...) _
#define COUNT_PARMS(...)\
    COUNT_PARMS2(__VA_ARGS__, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1)

void count_overload1(int p1)
{
    printf("One param: %d\n", p1);
}

void count_overload2(double *p1, const char *p2)
{
    printf("Two params: %p (%f) %s\n", (void *)p1, *p1, p2);
}

void count_overload3(int p1, int p2, int p3)
{
    printf("Three params: %c %d %d\n", p1, p2, p3);
}

void count_overload_aux(int count, ...)
{
    va_list v;
    va_start(v, count);

    switch(count)
    {
        case 1:
        {
            int p1 = va_arg(v, int);
            count_overload1(p1);
            break;
        }

        case 2:
        {
            double *p1 = va_arg(v, double *);
            const char *p2 = va_arg(v, const char *);
            count_overload2(p1, p2);
            break;
        }

        case 3:
        {
            int p1 = va_arg(v, int);
            int p2 = va_arg(v, int);
            int p3 = va_arg(v, int);
            count_overload3(p1, p2, p3);
            break;
        }

        default:
        {
            va_end(v);

            printf("Invalid arguments to function 'count_overload()'");
            exit(1);
        }
    }

    va_end(v);
}
#define count_overload(...)\
    count_overload_aux(COUNT_PARMS(__VA_ARGS__), __VA_ARGS__)


int main(int argc, char const *argv[])
{
    double d = 3.14;
    count_overload(1);
    count_overload(&d, "test");
    count_overload('a',2,3);
    return 0;
}

The output is:

One param: 1
Two params: 0x7ffc0fbcdd30 (3.140000) test
Three params: a 2 3

Answer 1:

Let's break down the COUNT_PARMS and COUNT_PARMS2 macros. First COUNT_PARMS:

#define COUNT_PARMS(...)\
    COUNT_PARMS2(__VA_ARGS__, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1)

Since the macro has no named parameters, any arguments passed to it are put in place of __VA_ARGS__.

So the following calls:

COUNT_PARMS(arg1)
COUNT_PARMS(arg1, arg2)
COUNT_PARMS(arg1, arg2, arg3)

Will expand to:

COUNT_PARMS2(arg1,   10,    9,  8, 7, 6, 5, 4, 3, 2, 1)
COUNT_PARMS2(arg1, arg2,   10,  9, 8, 7, 6, 5, 4, 3, 2, 1)
COUNT_PARMS2(arg1, arg2, arg3, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1)
                                                  // x

I spaced out the arguments so you can see which ones correspond to each other. Make special note of the column marked x. This is the number of parameters passed to COUNT_PARMS, and it's the 11th argument in each case.

Now let's look at COUNT_PARMS2:

#define COUNT_PARMS2(_1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _, ...) _

There are 11 named parameters, plus ... to account for any additional arguments. The entire body of the macro is _, which is the name of the 11th parameter. So the purpose of this macro is to take 11 or more arguments and replace them with just the 11th argument.

Looking again at the definition of COUNT_PARMS, it expands in such a way that it calls COUNT_PARMS2 with the 11th argument being the number of arguments passed to COUNT_PARMS. This is how the magic happens.
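One way to check this expansion yourself is to let the compiler evaluate the macro. The following is a minimal sketch that reuses the two macros from the question unchanged; the helper name count_demo is made up for illustration. The arguments are only counted by the preprocessor and then discarded, so they don't even have to be declared.

```c
#include <assert.h>
#include <stdio.h>

/* The two macros from the question, unchanged. */
#define COUNT_PARMS2(_1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _, ...) _
#define COUNT_PARMS(...)\
    COUNT_PARMS2(__VA_ARGS__, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1)

/* Hypothetical helper: a, b and c are never evaluated, they only shift
   the descending number list so that the count lands in slot 11. */
void count_demo(void)
{
    assert(COUNT_PARMS(a) == 1);
    assert(COUNT_PARMS(a, b) == 2);
    assert(COUNT_PARMS(a, b, c) == 3);

    printf("%d\n", COUNT_PARMS(a, b, c));   /* prints 3 */
}
```

Keep the two limits of the trick in mind: it counts at most 10 arguments, and COUNT_PARMS() with an empty argument list still yields 1, because the preprocessor sees one (empty) argument.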

Now looking at the function calls in main:

count_overload(1);
count_overload(&d, "test");
count_overload('a',2,3);

These expand to this:

count_overload_aux(COUNT_PARMS(1), 1);
count_overload_aux(COUNT_PARMS(&d, "test"), &d, "test");
count_overload_aux(COUNT_PARMS('a',2,3), 'a',2,3);

Then this:

count_overload_aux(COUNT_PARMS2(1, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1), 1);
count_overload_aux(COUNT_PARMS2(&d, "test", 10, 9, 8, 7, 6, 5, 4, 3, 2, 1), &d, "test");
count_overload_aux(COUNT_PARMS2('a',2,3, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1), 'a',2,3);

Then this:

count_overload_aux(1, 1);
count_overload_aux(2, &d, "test");
count_overload_aux(3, 'a',2,3);

The end result is that you can call a function that takes a variable number of arguments without having to explicitly say how many there are.
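As an aside, the runtime switch in count_overload_aux is not the only way to use the count. This is not what the code in the question does, but the same number can be pasted onto a function name, so the preprocessor picks the function directly and the compiler still type-checks the arguments. A minimal sketch, where PASTE, PASTE2, area, area1, area2 and area_demo are hypothetical names chosen for this example:

```c
#include <assert.h>
#include <stdio.h>

#define COUNT_PARMS2(_1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _, ...) _
#define COUNT_PARMS(...)\
    COUNT_PARMS2(__VA_ARGS__, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1)

/* Two levels, so that COUNT_PARMS(...) is fully expanded to a number
   before the ## operator glues it onto the function name. */
#define PASTE2(a, b) a ## b
#define PASTE(a, b) PASTE2(a, b)

/* Hypothetical "overloads", one per argument count. */
int area1(int side)    { return side * side; }
int area2(int w, int h) { return w * h; }

/* area(x) becomes area1(x), area(x, y) becomes area2(x, y). */
#define area(...) PASTE(area, COUNT_PARMS(__VA_ARGS__))(__VA_ARGS__)

void area_demo(void)
{
    assert(area(3) == 9);
    assert(area(3, 4) == 12);
    printf("%d %d\n", area(3), area(3, 4));   /* prints 9 12 */
}
```

With this variant there is no va_list at all, and a call with an unsupported number of arguments refers to a function that doesn't exist, so it is rejected when building rather than at run time.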



Answer 2:

dbush's great answer explains what the macros are doing. I'd like to expand on this and talk about the ellipsis ... which is used here. You say that reading about the variadic macros and __VA_ARGS__ didn't help, so I presume that you might not understand C ellipsis too well either.

In C, the way to declare a function that takes a variable number of arguments is to use the ellipsis .... A prime example of such a function is printf, which takes at least one argument but accepts many more.

The prototype of printf is:

int printf(const char *format, ...);

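The ... declares the ellipsis. Note that the ... can only appear at the end of the parameter list, after the named parameters, and that the last named parameter shouldn't be declared register or have a function or array type. ISO C before C23 also requires at least one named parameter before it, hence: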

void foo(...)
{
}

is invalid, and the compiler would show you an error like this:

c.c:6:10: error: ISO C requires a named argument before ‘...’
 void foo(...)
          ^~~

So, how do you use this? You use va_list, defined in stdarg.h:

#include<stdio.h>
#include<stdarg.h>

int sum(int num_of_values, ...)
{
    va_list ap;

    // use the last named argument
    va_start(ap, num_of_values);

    int s = 0;
    for(int i = 0; i < num_of_values; ++i)
    {
        int v = va_arg(ap, int);
        s += v;
    }

    va_end(ap);

    return s;
}

int main(void)
{
    printf("The sum is: %d\n", sum(5, 1, 2, 3, 4, 5));
}

which will output The sum is: 15.

So when your function has an ellipsis, you must first declare a variable of type va_list and call va_start with that variable as the first argument and the last named argument as the second argument.

Then you can fetch the values with va_arg(ap, <type>), where <type> is the type of the value; in the example above it is int. Functions like printf parse the format string and use the conversion specifiers to pick the correct type. When printf finds a %d, it does va_arg(ap, int); for %f it does va_arg(ap, double), because float arguments are promoted to double when passed through the ellipsis; for %s it does va_arg(ap, char*), and so on. That's why printf has undefined behaviour when the format and the arguments don't match: a wrong type would be used in the va_arg call, which also throws off subsequent va_arg calls. At the end, va_end must be called.
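To make the printf description more concrete, here is a rough sketch of a formatter that understands only %d, %f and %s. It is not how any real printf is implemented; mini_format and mini_format_demo are hypothetical names, and the function leans on snprintf (C99) to convert the individual values. The point is only to show the conversion specifier choosing the type given to va_arg.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mini formatter: writes at most size-1 characters plus a
   terminating '\0' into out and returns the number of characters written. */
int mini_format(char *out, size_t size, const char *format, ...)
{
    va_list ap;
    size_t used = 0;
    const char *p;

    if (size == 0)
        return 0;
    out[0] = '\0';

    va_start(ap, format);               /* format is the last named parameter */
    for (p = format; *p != '\0' && used + 1 < size; ++p) {
        int n = 0;

        if (*p != '%') {                /* ordinary character: copy it */
            out[used++] = *p;
            out[used] = '\0';
            continue;
        }

        ++p;                            /* the specifier decides the va_arg type */
        if (*p == 'd')
            n = snprintf(out + used, size - used, "%d", va_arg(ap, int));
        else if (*p == 'f')             /* float arrives promoted to double */
            n = snprintf(out + used, size - used, "%f", va_arg(ap, double));
        else if (*p == 's')
            n = snprintf(out + used, size - used, "%s", va_arg(ap, char *));
        else
            break;                      /* unknown specifier or end of string */

        if (n < 0)
            break;
        /* snprintf reports the untruncated length, so clamp to the buffer */
        used += ((size_t)n < size - used) ? (size_t)n : size - used - 1;
    }
    va_end(ap);

    return (int)used;
}

void mini_format_demo(void)
{
    char buf[64];
    float price = 9.5f;                 /* passed as double through the ellipsis */

    mini_format(buf, sizeof buf, "%s has %d items costing %f", "cart", 3, price);
    assert(strcmp(buf, "cart has 3 items costing 9.500000") == 0);
    puts(buf);                          /* prints: cart has 3 items costing 9.500000 */
}
```

If the caller passed an int where the format says %s, the va_arg(ap, char *) call would read the wrong type, which is exactly the undefined behaviour described above.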

For a microkernel that I had to write during my university days, I had to implement these va_* macros myself. I relied on the compiler putting all arguments in the stack frame, so my va_start calculated the stack address of the next value after the last named argument. va_arg moved through the stack based on va_start's calculation plus an offset determined by the type, while also updating the ap variable to point past the last consumed argument. It was tricky to get right, but in the end it worked on that system. The same implementation on x86_64, however, produces only garbage, most likely because the x86_64 calling conventions pass the first few arguments in registers rather than on the stack.

How exactly this is implemented for example in the GCC compiler, I don't know, but I suspect that GCC does something similar. I've checked the source code gcc/builtins.c:4855 but as usual, I find the GCC code to be very complicated to follow.