Macro directives in C, my code example doesn't

2019-07-14 03:46发布

问题:

I want to get the following code piece work:

#define READIN(a, b) if(scanf('"#%d"', '"&a"') != 1) { printf("ERROR"); return EXIT_FAILURE; }

int main(void)
{
    unsigned int stack_size;
    printf("Type in size: ");
    READIN(d, stack_size);
}

I don't get it, how to use directives with the # operator. I want to use the scanf with print ERROR etc. several times, but the "'"#%d"' & '"&a"'" is I think completely wrong. Is there any way to get that running? I think a macro is the best solution or not?

回答1:

You should only stringify arguments to the macro, and they must be outside of strings or character constants in the replacement text of the macro. Thus you probably should use:

#define READIN(a, b) do { if (scanf("%" #a, &b) != 1) \
                          { fprintf(stderr, "ERROR\n"); return EXIT_FAILURE; } \
                     } while (0)

int main(void)
{
    unsigned int stack_size;
    printf("Type in size: ");
    READIN(u, stack_size);
    printf("You entered %u\n", stack_size);
    return(0);
}

There are many changes. The do { ... } while (0) idiom prevents you from getting compilation errors in circumstances such as:

if (i > 10)
    READIN(u, j);
else
    READIN(u, k);

With your macro, you'd get an unexpected keyword 'else' type of message because the semi-colon after the first READIN() would be an empty statement after the embedded if, so the else could not belong to the visible if or the if inside the macro.

The type of stack_size is unsigned int; the correct format specifier, therefore, is u (d is for a signed int).

And, most importantly, the argument a in the macro is stringized correctly (and string concatenation of adjacent string literals - an extremely useful feature of C89! - takes care of the rest for you. And the argument b in the macro is not embedded in a string either.

The error reporting is done to stderr (the standard stream for reporting errors on), and the message ends with a newline so it will actually appear. I didn't replace return EXIT_FAILURE; with exit(EXIT_FAILURE);, but that would probably be a sensible choice if the macro will be used outside of main(). That assumes that 'terminate on error' is the appropriate behaviour in the first place. It often isn't for interactive programs, but fixing it is a bit harder.

I'm also ignoring my reservations about using scanf() at all; I usually avoid doing so because I find error recovery too hard. I've only been programming in C for about 28 years, and I still find scanf() too hard to control, so I essentially never use it. I typically use fgets() and sscanf() instead. Amongst other merits, I can report on the string that caused the trouble; that's hard to do when scanf() may have gobbled some of it.


My thought with scanf() here is, to only read in positive numbers and no letters. My overall code does create a stack, which the user types in and the type should be only positive, otherwise error. [...] I only wanted to know if there's a better solution to forbid the user to type in something other than positive numbers?

I just tried the code above (with #include <stdlib.h> and #include <stdio.h> added) and entered -2 and got told 4294967294, which isn't what I wanted (the %u format does not reject -2, at least on MacOS X 10.7.2). So, I would go with fgets() and strtoul(), most likely. However, accurately detecting all possible problems with strtoul() is an exercise of some delicacy.

This is the alternative code I came up with:

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <limits.h>
#include <string.h>

int main(void)
{
    unsigned int stack_size = 0;
    char buffer[4096];
    printf("Type in size: ");
    if (fgets(buffer, sizeof(buffer), stdin) == 0)
        printf("EOF or error detected\n");
    else
    {
        char *eos;
        unsigned long u;
        size_t len = strlen(buffer);
        if (len > 0)
            buffer[len - 1] = '\0';  // Zap newline (assuming there is one)
        errno = 0;
        u = strtoul(buffer, &eos, 10);
        if (eos == buffer ||
            (u == 0 && errno != 0) ||
            (u == ULONG_MAX && errno != 0) ||
            (u > UINT_MAX))
        {
            printf("Oops: one of many problems occurred converting <<%s>> to unsigned integer\n", buffer);
        }
        else
            stack_size = u;
        printf("You entered %u\n", stack_size);
    }
    return(0);
}

The specification of strtoul() is given in ISO/IEC 9899:1999 §7.20.1.4:

¶1 [...]

unsigned long int strtoul(const char * restrict nptr,
char ** restrict endptr, int base);

[...]

¶2 [...] First, they decompose the input string into three parts: an initial, possibly empty, sequence of white-space characters (as specified by the isspace function), a subject sequence resembling an integer represented in some radix determined by the value of base, and a final string of one or more unrecognized characters, including the terminating null character of the input string. Then, they attempt to convert the subject sequence to an integer, and return the result.

¶3 [...]

¶4 The subject sequence is defined as the longest initial subsequence of the input string, starting with the first non-white-space character, that is of the expected form. The subject sequence contains no characters if the input string is empty or consists entirely of white space, or if the first non-white-space character is other than a sign or a permissible letter or digit.

¶5 If the subject sequence has the expected form and the value of base is zero, the sequence of characters starting with the first digit is interpreted as an integer constant according to the rules of 6.4.4.1. If the subject sequence has the expected form and the value of base is between 2 and 36, it is used as the base for conversion, ascribing to each letter its value as given above. If the subject sequence begins with a minus sign, the value resulting from the conversion is negated (in the return type). A pointer to the final string is stored in the object pointed to by endptr, provided that endptr is not a null pointer.

¶6 [...]

¶7 If the subject sequence is empty or does not have the expected form, no conversion is performed; the value of nptr is stored in the object pointed to by endptr, provided that endptr is not a null pointer.

Returns

¶8 The strtol, strtoll, strtoul, and strtoull functions return the converted value, if any. If no conversion could be performed, zero is returned. If the correct value is outside the range of representable values, LONG_MIN, LONG_MAX, LLONG_MIN, LLONG_MAX, ULONG_MAX, or ULLONG_MAX is returned (according to the return type and sign of the value, if any), and the value of the macro ERANGE is stored in errno.

The error I got was from a 64-bit compilation where -2 was converted to a 64-bit unsigned long, and that was outside the range acceptable to a 32-bit unsigned int (the failing condition was u > UINT_MAX). When I recompiled in 32-bit mode (so sizeof(unsigned int) == sizeof(unsigned long)), then the value -2 was accepted again, interpreted as 4294967294 again. So, even this is not delicate enough...you probably have to do a manual skip of leading blanks and reject a negative sign (and maybe a positive sign too; you'd also need to #include <ctype.h> too):

        char *bos = buffer;
        while (isspace(*bos))
            bos++;
        if (!isdigit(*bos))
            ...error - not a digit...
        char *eos;
        unsigned long u;
        size_t len = strlen(bos);
        if (len > 0)
            bos[len - 1] = '\0';  // Zap newline (assuming there is one)
        errno = 0;
        u = strtoul(bos, &eos, 10);
        if (eos == bos ||
            (u == 0 && errno != 0) ||
            (u == ULONG_MAX && errno != 0) ||
            (u > UINT_MAX))
        {
            printf("Oops: one of many problems occurred converting <<%s>> to unsigned integer\n", buffer);
        }

As I said, the whole process is rather non-trivial.

(Looking at it again, I'm not sure whether the u == 0 && errno != 0 clause would ever catch any errors...maybe not because the eos == buffer (or eos == bos) condition catches the case there's nothing to convert at all.)



回答2:

You are incorrectly encasing your macro argument(s), it should look like:

#define READIN(a, b) if(scanf("%"#a, &b) != 1) { printf("ERROR"); return EXIT_FAILURE; }

you use of the stringify operator was also incorrect, it must directly prefix the argument name.

In short, use "%"#a, not '"#%d"', and &b, not '"&a"'.

as a side note, for longish macro's like those, it helps to make them multi-line using \, this keeps them readable:

#define READIN(a, b) \
if(scanf("%"#a, &b) != 1) \
{ \
    printf("ERROR"); \
    return EXIT_FAILURE; \
}

When doing something like this, one should preferably use a function, something along the lines of this should work:

inline int readIn(char* szFormat, void* pDst)
{
    if(scanf(szFormat,pDst) != 1)
    {
        puts("Error");
        return 0;
    }

    return 1;
}

invoking it would be like so:

if(!readIn("%d",&stack_size))
    return EXIT_FAILURE;


回答3:

scanf(3) takes a const char * as a first argument. You are passing '"..."', which is not a C "string". C strings are written with the " double quotes. The ' single quotes are for individual characters: 'a' or '\n' etc.

Placing a return statement inside a C preprocessor macro is usually considered very poor form. I've seen goto error; coded inside preprocessor macros before for repetitive error handling code when storing formatted data to and reading data from a file or kernel interface, but these are definitely exceptional circumstances. You would detest debugging this in six months time. Trust me. Do not hide goto, return, break, continue, inside C preprocessor macros. if is alright so long as it is entirely contained within the macro.

Also, please get in the habit of writing your printf(3) statements like this:

printf("%s", "ERROR");

Format string vulnerabilities are exceedingly easy to write. Your code does not contain any such vulnerability now, but trust me, at some point in the future those strings are inevitably modified to include some user-supplied content, and putting in an explicit format string now will help prevent these in the future. At least you'll think about it in the future if you see this.

It is considered polite to wrap your multi-line macros in do { } while (0) blocks.

Finally, the stringification is not quite done correctly; try this instead:

#define READIN(A, B) do { if (scanf("%" #A, B) != 1) { \
        /* error handling */ \
    } else { \
        /* success case */ \
    } } while(0)

Edit: I feel I should re-iterate akappa's advice: Use a function instead. You get better type checking, better backtraces when something goes wrong, and it is far easier to work with. Functions are good.