Format string with multiple percent signs

Posted 2019-06-22 05:41

Question:

I know %% is used to escape a literal % sign in a format string, so %%%ds ends up as %10s in the following format string, but I don't understand why %%5s is needed in this string.

After all, there are only two additional arguments (BUFFSIZE / 10).

#define BUFFSIZE 100
char buf[100] = {0};
sprintf(buf, "%%5s %%%ds %%%ds", BUFFSIZE / 10, BUFFSIZE / 10);

After running the code above, buf contains the string:

%10s %10s 

Answer 1:

The purpose is to build a format string for another function that takes one, such as sscanf().

With your code, %5s %10s %10s is written to buf, which means the resulting format accepts three strings, each with its own field width.

%%5s          --> %5s
%%%ds with 10 --> %10s (read it as {%%}{%d}{s})

The contents of that buffer, %5s %10s %10s, can now be passed as the format argument of a sscanf() call.
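As a minimal sketch of that two-step flow: the function below first builds the format with snprintf() and then hands it to sscanf(). The name parse_three_words and the size of the fmt buffer are assumptions made for illustration; they are not part of the question's code.

```c
#include <stdio.h>

#define BUFFSIZE 100

/* Hypothetical helper: builds "%5s %10s %10s" at run time, then uses it
   to split line into three words. Returns the number of fields stored. */
int parse_three_words(const char *line, char a[6],
                      char b[BUFFSIZE / 10 + 1], char c[BUFFSIZE / 10 + 1])
{
    char fmt[32]; /* assumed size, large enough for "%5s %10s %10s" */

    /* "%%" yields a literal '%'; each "%d" consumes one width argument. */
    snprintf(fmt, sizeof fmt, "%%5s %%%ds %%%ds", BUFFSIZE / 10, BUFFSIZE / 10);

    /* Each width caps the characters stored for that field, so every
       destination needs room for width + 1 bytes (the extra one is '\0'). */
    return sscanf(line, fmt, a, b, c);
}
```

Called with the line "alpha beta gamma" and buffers of 6, 11 and 11 bytes, this should store the three words and return 3.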

There is also a best practice for preventing buffer overflows caused by sscanf(), described by Kernighan and Pike in their book The Practice of Programming and discussed elsewhere on SO.


The likely reason you can't use %*s here is that * means different things to printf and scanf, as explained elsewhere on SO:

For printf, the * lets you specify the minimum field width through an extra argument, e.g. printf("%*d", 4, 100); specifies a field width of 4.

For scanf, the * indicates that the field is to be read but ignored, so that, e.g., scanf("%*d %d", &i) for the input "12 34" will ignore 12 and read 34 into the integer i.
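The two meanings of * can be put side by side in a short sketch. The helper names format_padded and read_second_int are hypothetical, and snprintf()/sscanf() are used in place of printf()/scanf() only so the results can be checked without terminal input.

```c
#include <stdio.h>

/* printf family: '*' takes the minimum field width from an int argument. */
int format_padded(char *out, size_t outsize, int width, int value)
{
    return snprintf(out, outsize, "%*d", width, value);
}

/* scanf family: '*' suppresses assignment, so the first number is read
   and discarded, and only the second one is stored. */
int read_second_int(const char *input, int *second)
{
    return sscanf(input, "%*d %d", second);
}
```

With width 4 and value 100, format_padded() writes " 100". For the input "12 34", read_second_int() stores 34 and returns 1, because suppressed fields are not counted in the return value.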



Answer 2:

% in itself is a valid conversion specifier. The prescribed syntax is, as stated in C11, chapter §7.21.6.1/P4:

Each conversion specification is introduced by the character %. After the %, the following appear in sequence:

  • Zero or more flags [...]

  • An optional minimum field width.

  • An optional precision [...]

  • An optional length modifier [...]

  • A conversion specifier character that specifies the type of conversion to be applied.

Then, from P8, on conversion specifiers:

The conversion specifiers and their meanings are:

......

%

A % character is written. No argument is converted. The complete conversion specification shall be %%.

So, scanning greedily from left to right, sprintf() (not the compiler; the format string is parsed at run time) will group a sequence like

 ....   %%%ds, BUFFSIZE / 10 ....

as

 {%%}{%d}{s}
  ^^--------------------------Replaced as %
      ^^----------------------Actual conversion specification happens, argument is used
          ^-------------------just part of final output

which finally produces

  %Xs    //where X is the value of (BUFFSIZE / 10)

which is itself a valid format string (%, field width, conversion specifier, all in order), ready to be used later.
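The grouping can be checked with a small sketch that expands "%%%ds" for an arbitrary width. The helper name make_string_format is an assumption made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: writes "%<width>s" into fmt, e.g. "%10s" for
   width 10. Returns the number of characters snprintf wanted to write. */
int make_string_format(char *fmt, size_t fmtsize, int width)
{
    /* Parsed left to right as "%%" (literal %), "%d" (the width argument),
       and "s" (an ordinary character copied as-is). */
    return snprintf(fmt, fmtsize, "%%%ds", width);
}
```

For a width of 10 the buffer ends up holding the four characters %10s, and for a width of 5 it holds %5s.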



Answer 3:

The OP is computing a format string from parametric sizes. Given the arguments, the string will contain %5s %10s %10s, which can be used with either printf or scanf:

printf("%5s %10s %10s", "A", "B", "C");

outputs:

    A          B          C

char a[6], b[11], c[11];
scanf("%5s %10s %10s", a, b, c);

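will read three strings into a, b, and c, but limits the number of characters stored for each string to prevent buffer overflow. Each array is one byte larger than its field width to leave room for the terminating null character.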
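The effect of the width limit can be seen in isolation with a sketch like the following. The helper name read_word_max5 is hypothetical, and sscanf() stands in for scanf() so that the input is a plain string.

```c
#include <stdio.h>

/* Hypothetical helper: reads one word of at most 5 characters from input.
   out must have room for 6 bytes: 5 characters plus the terminating '\0'. */
int read_word_max5(const char *input, char out[6])
{
    /* "%5s" stops after 5 non-whitespace characters even if the word is
       longer; the rest of the word is left unread. */
    return sscanf(input, "%5s", out);
}
```

Given "abcdefghij", only "abcde" is stored. Given "hi there", the conversion stops at the space and stores "hi".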

Note however that for the printf case, it is not necessary to compute the string as posted, since you could have used:

printf("%5s %*s %*s", "A", BUFFSIZE / 10, "B", BUFFSIZE / 10, "C");

Unfortunately, scanf() attaches different semantics to the * modifier: the maximum number of characters to store cannot be passed as an argument, only written as digits in the format string, hence the need for a separate formatting step.