How to write and read strings (including spaces) from a text file

Posted 2019-08-07 06:32

Question:

I'm using fscanf and fprintf.

I tried to delimit the strings on each line with \t and to read them like so:

fscanf(fp,"%d\t%s\t%s",&t->num,&t->string1,&t->string2);

The file contents:

1[TAB]string1[TAB]some string[NEWLINE]

It does not read properly. If I printf("%d %s %s",t->num,t->string1,t->string2) I get:

1 string1 some

Also I get this compile warning:

warning: format specifies type 'char *' but the argument has type 'char (*)[15]' [-Wformat]

How can I fix this without using binary r/w?

Answer 1:

I'm guessing the space in "some string" is the problem. fscanf() reading a string using %s stops at the first whitespace character. To include spaces, use something like:

fscanf(fp, "%d\t%[^\n\t]\t%[^\n\t]", &t->num, t->string1, t->string2);
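To make the difference concrete, here is a minimal sketch that parses the line from the question both ways. It uses sscanf() on an in-memory string so the input is visible; the conversions behave the same as with fscanf(). The struct record type and the function names are hypothetical, and the 15-byte arrays (hence the width of 14) are only inferred from the compiler warning in the question.

```c
#include <stdio.h>

/* Hypothetical record type; the 15-byte arrays are an assumption
   based on the 'char (*)[15]' in the compiler warning. */
struct record {
    int  num;
    char string1[15];
    char string2[15];
};

/* %s: each string stops at the first white-space character. */
int parse_with_s(const char *line, struct record *t)
{
    return sscanf(line, "%d\t%14s\t%14s", &t->num, t->string1, t->string2);
}

/* Scan sets: each string runs up to the next tab or newline,
   so embedded blanks are kept. */
int parse_with_scanset(const char *line, struct record *t)
{
    return sscanf(line, "%d\t%14[^\t\n]\t%14[^\t\n]",
                  &t->num, t->string1, t->string2);
}
```

Given "1\tstring1\tsome string\n", parse_with_s() leaves "some" in string2, while parse_with_scanset() leaves "some string". Note that the arrays are passed without &.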

See also a reference page for fscanf() and/or another StackOverflow thread on reading tab-delimited items in C.

[EDIT in response to your edit: You seem to also have a problem with the arguments you're passing into fscanf(). You will need to post the declarations of t->string1 to be sure, but it looks like string1 is an array of characters, and therefore you should remove the & from the fscanf() call...]



Answer 2:

The %s conversion specification stops reading at the first white space, and tabs and blanks both count as white space.

If you want to read a string of non-tabs, you can use a 'scan set' conversion specifier:

if (fscanf(fp, "%d\t%[^\t\n]\t%[^\t\n]", &t->num, t->string1, t->string2) != 3)
    ...oops - format error in input data...

(I'd lay odds that omitting the & from the string arguments is correct.) The question was edited; I win. Dropping the & is necessary to avoid the compiler warning!

This still doesn't quite do what you expect. If there are blanks at the start of the second field, they'll be eaten by the \t in the format string. Any white space in the format string eats any white space (including newlines) in the input. The %[^\t] conversion specification won't get started until there's a character that isn't white space in the input. I'm also assuming you want your input limited by newlines. You can leave out the \n characters if you prefer.
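The white-space behaviour described above can be seen in a small sketch. The function names are made up for illustration, and the 15-byte buffer is an assumption carried over from the question. The second function shows one possible workaround, not something the question asked for: %*1[\t] consumes exactly one tab and discards it, so blanks that follow the tab stay in the field.

```c
#include <stdio.h>

/* "\t" in the format skips ANY run of white space (tabs, blanks,
   newlines), so leading blanks in the field are lost.
   field must point to at least 15 bytes. */
int read_field(const char *line, int *num, char *field)
{
    return sscanf(line, "%d\t%14[^\t\n]", num, field);
}

/* Possible alternative: "%*1[\t]" matches exactly one tab and
   suppresses assignment, so leading blanks in the field survive. */
int read_field_keep_blanks(const char *line, int *num, char *field)
{
    return sscanf(line, "%d%*1[\t]%14[^\t\n]", num, field);
}
```

With the input "7\t  padded", read_field() stores "padded" and read_field_keep_blanks() stores "  padded". Both return 2, because a suppressed conversion is not counted.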

Note that I checked that the fscanf() interpreted 3 fields. It is important to error check your inputs.
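As a rough illustration of why the return value matters, the sketch below just returns the count so different inputs can be compared. count_fields() is a hypothetical helper, and the 15-byte buffers are assumed from the question.

```c
#include <stdio.h>

/* Returns whatever sscanf() returns: the number of items assigned,
   or EOF if the input ran out before the first conversion.
   s1 and s2 must each point to at least 15 bytes. */
int count_fields(const char *line, int *num, char *s1, char *s2)
{
    return sscanf(line, "%d\t%14[^\t\n]\t%14[^\t\n]", num, s1, s2);
}
```

For "1\ta\tb" this returns 3. For "1\ta" it returns 2 (the last field is missing), for "x\ta\tb" it returns 0 (the number does not match), and for an empty string it returns EOF. Only the first case leaves all three variables with meaningful values.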

If you really want control, you should probably read whole lines with fgets() and then use sscanf() to parse the data.


About fgets() and sscanf(): can you expand on how they give more control?

Suppose the input data is written

1234



a string with spaces



another string

spread out over multiple lines like that. With raw fscanf(), this will be acceptable input even though it is spread over 9 lines of input. With fgets(), you can read a single line, and then analyze it with sscanf(), and you'll know that the first line was not in the correct format. You can then decide what to do about it.
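A minimal sketch of the line-based approach might look like this. read_record() and the 256-byte line buffer are assumptions for illustration; the 15-byte field buffers come from the question.

```c
#include <stdio.h>

/* Reads one line from fp and parses it as "<int>\t<str>\t<str>".
   Returns 1 on success, 0 if the line is malformed, EOF at end of file.
   s1 and s2 must each point to at least 15 bytes. */
int read_record(FILE *fp, int *num, char *s1, char *s2)
{
    char line[256];   /* assumed upper bound on line length */

    if (fgets(line, sizeof line, fp) == NULL)
        return EOF;

    /* sscanf() only sees this one line, so a record cannot
       silently continue onto the following lines. */
    if (sscanf(line, "%d\t%14[^\t\n]\t%14[^\t\n]", num, s1, s2) != 3)
        return 0;

    return 1;
}
```

With the spread-out input above, the first call reads "1234" alone, sscanf() assigns only one item, and the function reports a malformed line instead of stitching the record together from later lines. The caller can then skip the line, report its number, or abort.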

Also, since mafso called me on it in his comment, we should ensure that there are no buffer overflows by limiting the size of the strings that the scan sets match.

if (fscanf(fp, "%d\t%14[^\t\n]\t%14[^\t\n]", &t->num, t->string1, t->string2) != 3)
    ...oops - format error in input data...

I'm using the error message about char (*)[15] to deduce that 14 is the correct number to use. Note that unlike printf(), you can't specify the sizes via * notation (in the scanf() family, * suppresses assignment), so you have to create the format with the correct sizes. Further, the size you specify is the number of characters before the terminating null byte, so if the array is of size 15, the size you specify in the format string is 14, as shown.
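One common way to create the format with the correct sizes, without typing 14 by hand in several places, is to paste the width in with the preprocessor. This is only a sketch: FIELD_SIZE, FIELD_WIDTH, STR, RECORD_FMT and parse_record() are names invented here, and _Static_assert requires C11.

```c
#include <stdio.h>

#define FIELD_SIZE  15   /* array size, including the null byte */
#define FIELD_WIDTH 14   /* maximum characters scanf may store */

/* Fails to compile if the two constants drift apart. */
_Static_assert(FIELD_WIDTH == FIELD_SIZE - 1, "width must be size - 1");

#define STR_(x) #x
#define STR(x)  STR_(x)  /* expands x first, then turns it into a string */

/* Adjacent string literals are concatenated, giving
   "%d\t%14[^\t\n]\t%14[^\t\n]" at compile time. */
#define RECORD_FMT \
    "%d\t%" STR(FIELD_WIDTH) "[^\t\n]\t%" STR(FIELD_WIDTH) "[^\t\n]"

int parse_record(const char *line, int *num,
                 char s1[FIELD_SIZE], char s2[FIELD_SIZE])
{
    return sscanf(line, RECORD_FMT, num, s1, s2);
}
```

The width has to be a plain integer literal for the stringizing trick to work, which is why FIELD_WIDTH is written as 14 rather than as FIELD_SIZE - 1. Another option is to build the format at run time with snprintf(). Either way, a field longer than 14 characters is not rejected: the first 14 characters are stored and the rest is left in the input for the next conversion, so a length check is still worth doing.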