Is this possible to customize printf?

2019-01-15 08:10发布

问题:

I have some struct that I need to print frequently. For now, I am using a classical print wrapper around this struct :

void printf_mystruct(struct* my_struct)
{
   if (my_struct==NULL) return;
   printf("[value1:%d value2:%d]", struct->value1, struct->value2);
}

This function is handy, but is also really limited. I cannot prepen or append some text without making a new wrapper. I know that I can use va_arg family to be able to prepend or apprend some text, but I feel like I would be re-implementing the wheel.

I am wondering if it's possible to write a customizing function to printf. I would like to be able to write something like this :

register2printf("%mys", &printf_mystruct); 
...
if (incorrect)
  printf("[%l] Struct is incorrect : %mys\n", log_level, my_struct);

Is this possible ? How can I do this ?

NB: I am under Ubuntu Linux 10.04 and I use gcc.

回答1:

Sorry, but some answers are incorrect on Linux with Glibc

On Linux with a GNU Glibc, you can customize printf: you would call register_printf_function to e.g. define the meaning of %Y in your printf format strings.

However, this behavior is Glibc specific, and might even become obsolete... I'm not sure I would recommend this approach!

If coding in C++, the C++ stream library has manipulators which you could extend, and you can also overload for your types the operator << etc.

added in february 2018

You could consider writing a GCC plugin helping that (and improving the typechecking of some extended printf). It won't be easy (probably a few weeks or months of work), and it would be GCC version specific (not the same plugin code for GCC 7 and GCC 8). you might add some specific #pragma to inform your plugin about extra control string specifiers like your %Y and the type expected for them. Your plugin should change the handling of format attribute (perhaps in gcc/tree.c)



回答2:

This is not possible in standard C. You cannot extend printf to add custom format strings. Your helper function approach is probably about as good as you will get within the constraints of C.



回答3:

Unfortunately that's not possible.

Probably the easiest solution would be taking a small printf implementation (e.g. from a libc for embedded systems) and extending it to fit your purposes.



回答4:

No, this is not possible. An alternative is to make your own wrapper around printf() itself. It would parse the format string and process conversions like printf() does. If a conversion is one of your custom conversions, it would print whatever you need, and if not, it would call one of the system's *printf() functions to have it perform the conversion for you.

Note that this is a non-trivial task, and you have to be careful to parse the format string exactly like printf() does. See man 3 printf. You can read the variable argument list using functions in <stdarg.h>.

Once you have such a wrapper, you can make it extensible by employing function pointers (the custom conversions don't have to be hard-coded into the wrapper).



回答5:

You can use the sprintf function to obtain a string representation of your struct:

char* repr_mystruct(char* buffer, struct* my_struct)
{
    sprintf(buffer, "[string:%s value1:%d value2:%d]", struct->value1, struct->value2);
    return buffer;
}

and subsequently print the data to your output stream

char buffer[512]; //However large you need it to be
printf("My struct is: %s", repr_mystruct(buffer, &my_struct))

Edit: Modified the function to allow the passing of a buffer (see discussion below).

Note 2: The format string requires three arguments but in the example only two are passed.



回答6:

Just leave it here:

printf("%s: pid = %lu, ppid = %lu, pgrp = %lu, tpgrp = %lu\n", name, 
        (unsigned long int)getpid(), (unsigned long int)getppid(), (unsigned long int)getpgrp(), 
        (unsigned long int)tcgetpgrp(STDIN_FILENO));


回答7:

Assuming you want portable code, the glibc extensions are out. But even keeping to C99 and POSIX standards it is very much possible, I just wrote one.

You don't have to re-implement printf, you do unfortunately need to make your code smart enough to parse printf format strings, and infer the variadic argument's C types from them.

When variadic arguments are placed on the stack, no type or sizing information is included.

void my_variadic_func(fmt, ...)
{

}

my_variadic_func("%i %s %i", 1, "2", 3);

In the above example on a 64bit system, with 48bit addressing the compiler would likely end up allocating 4bytes + 6bytes + 4byte = 14bytes of stack memory, and packing the values into that. I say likely, because how the memory is allocated and the arguments packed is implementation specific.

That means, in order to access the pointer value for %s in the above string, you need to know that the first argument was of type int, so you can advance your va_list cursor to the right point.

The only way you can get that type information is by looking at the format string, and seeing what type the user specified (in this case %i).

So in order to implement @AmbrozBizjak's suggestion, of passing subfmt strings to printf, you need to parse the fmt string, and after each complete, non-custom fmt specifier, advance a va_list by (however many bytes wide) the fmt type was.

When you hit a custom fmt specifier, your va_list is at the right point to unpack the argument. You can then use va_arg() to get your custom argument (passing the right type), and use it to run whatever code you need to, to produce your custom fmt specifier's output.

You concatenate the output from your previous printf call, and your custom fmt specifier's output, and carry on processing, until you reach the end, at which point you call printf again to process the rest of your format string.

The code is more complex (so I included it below), but that gives you a basic idea of what you have to do.

My code also uses talloc... but you can do it with the standard memory functions, just requires a bit more string wrangling.

char *custom_vasprintf(TALLOC_CTX *ctx, char const *fmt, va_list ap)
{
    char const  *p = fmt, *end = p + strlen(fmt), *fmt_p = p, *fmt_q = p;
    char        *out = NULL, *out_tmp;
    va_list     ap_p, ap_q;

    out = talloc_strdup(ctx, "");
    va_copy(ap_p, ap);
    va_copy(ap_q, ap_p);

    do {

        char        *q;
        char        *custom;
        char        len[2] = { '\0', '\0' };
        long        width = 0, group = 0, precision = 0, tmp;

        if ((*p != '%') || (*++p == '%')) {
            fmt_q = p + 1;
            continue;   /* literal char */
        }

        /*
         *  Check for parameter field
         */
        tmp = strtoul(p, &q, 10);
        if ((q != p) && (*q == '$')) {
            group = tmp;
            p = q + 1;
        }

        /*
         *  Check for flags
         */
        do {
            switch (*p) {
            case '-':
                continue;

            case '+':
                continue;

            case ' ':
                continue;

            case '0':
                continue;

            case '#':
                continue;

            default:
                goto done_flags;
            }
        } while (++p < end);
    done_flags:

        /*
         *  Check for width field
         */
        if (*p == '*') {
            width = va_arg(ap_q, int);
            p++;
        } else {
            width = strtoul(p, &q, 10);
            p = q;
        }

        /*
         *  Check for precision field
         */
        if (*p == '.') {
            p++;
            precision = strtoul(p, &q, 10);
            p = q;
        }

        /*
         *  Length modifiers
         */
        switch (*p) {
        case 'h':
        case 'l':
            len[0] = *p++;
            if ((*p == 'h') || (*p == 'l')) len[1] = *p++;
            break;

        case 'L':
        case 'z':
        case 'j':
        case 't':
            len[0] = *p++;
            break;
        }

        /*
         *  Types
         */
        switch (*p) {
        case 'i':                               /* int */
        case 'd':                               /* int */
        case 'u':                               /* unsigned int */
        case 'x':                               /* unsigned int */
        case 'X':                               /* unsigned int */
        case 'o':                               /* unsigned int */
            switch (len[0]) {
            case 'h':
                if (len[1] == 'h') {                    /* char (promoted to int) */
                    (void) va_arg(ap_q, int);
                } else {
                    (void) va_arg(ap_q, int);           /* short (promoted to int) */
                }
                break;

            case 'L':
                if ((*p == 'i') || (*p == 'd')) {
                    if (len [1] == 'L') {
                        (void) va_arg(ap_q, long);      /* long */
                    } else {
                        (void) va_arg(ap_q, long long);     /* long long */
                    }
                } else {
                    if (len [1] == 'L') {
                        (void) va_arg(ap_q, unsigned long); /* unsigned long */
                    } else {
                        (void) va_arg(ap_q, unsigned long long);/* unsigned long long */
                    }
                }
                break;

            case 'z':
                (void) va_arg(ap_q, size_t);                /* size_t */
                break;

            case 'j':
                (void) va_arg(ap_q, intmax_t);              /* intmax_t */
                break;

            case 't':
                (void) va_arg(ap_q, ptrdiff_t);             /* ptrdiff_t */
                break;

            case '\0':  /* no length modifier */
                if ((*p == 'i') || (*p == 'd')) {
                    (void) va_arg(ap_q, int);           /* int */
                } else {
                    (void) va_arg(ap_q, unsigned int);      /* unsigned int */
                }
            }
            break;

        case 'f':                               /* double */
        case 'F':                               /* double */
        case 'e':                               /* double */
        case 'E':                               /* double */
        case 'g':                               /* double */
        case 'G':                               /* double */
        case 'a':                               /* double */
        case 'A':                               /* double */
            switch (len[0]) {
            case 'L':
                (void) va_arg(ap_q, long double);           /* long double */
                break;

            case 'l':   /* does nothing */
            default:    /* no length modifier */
                (void) va_arg(ap_q, double);                /* double */
            }
            break;

        case 's':
            (void) va_arg(ap_q, char *);                    /* char * */
            break;

        case 'c':
            (void) va_arg(ap_q, int);                   /* char (promoted to int) */
            break;

        case 'p':
            (void) va_arg(ap_q, void *);                    /* void * */
            break;

        case 'n':
            (void) va_arg(ap_q, int *);                 /* int * */
            break;

        /*
         *  Custom types
         */
        case 'v':
        {
            value_box_t const *value = va_arg(ap_q, value_box_t const *);

            /*
             *  Allocations that are not part of the output
             *  string need to occur in the NULL ctx so we don't fragment
             *  any pool associated with it.
             */
            custom = value_box_asprint(NULL, value->type, value->datum.enumv, value, '"');
            if (!custom) {
                talloc_free(out);
                return NULL;
            }

        do_splice:
            /*
             *  Pass part of a format string to printf
             */
            if (fmt_q != fmt_p) {
                char *sub_fmt;

                sub_fmt = talloc_strndup(NULL, fmt_p, fmt_q - fmt_p);
                out_tmp = talloc_vasprintf_append_buffer(out, sub_fmt, ap_p);
                talloc_free(sub_fmt);
                if (!out_tmp) {
                oom:
                    fr_strerror_printf("Out of memory");
                    talloc_free(out);
                    talloc_free(custom);
                    va_end(ap_p);
                    va_end(ap_q);
                    return NULL;
                }
                out = out_tmp;

                out_tmp = talloc_strdup_append_buffer(out, custom);
                TALLOC_FREE(custom);
                if (!out_tmp) goto oom;
                out = out_tmp;

                va_end(ap_p);       /* one time use only */
                va_copy(ap_p, ap_q);    /* already advanced to the next argument */
            }

            fmt_p = p + 1;
        }
            break;

        case 'b':
        {
            uint8_t const *bin = va_arg(ap_q, uint8_t *);

            /*
             *  Only automagically figure out the length
             *  if it's not specified.
             *
             *  This allows %b to be used with stack buffers,
             *  so long as the length is specified in the format string.
             */
            if (precision == 0) precision = talloc_array_length(bin);

            custom = talloc_array(NULL, char, (precision * 2) + 1);
            if (!custom) goto oom;
            fr_bin2hex(custom, bin, precision);

            goto do_splice;
        }

        default:
            break;
        }
        fmt_q = p + 1;
    } while (++p < end);

    /*
     *  Print out the rest of the format string.
     */
    if (*fmt_p) {
        out_tmp = talloc_vasprintf_append_buffer(out, fmt_p, ap_p);
        if (!out_tmp) goto oom;
        out = out_tmp;
    }

    va_end(ap_p);
    va_end(ap_q);

    return out;
}

EDIT:

It's probably worth doing what the Linux folks do and overloading %p to make new format specifiers, i.e. %pA %pB. This means the static printf format checks don't complain.