sscanf function changes the content of another string

Posted 2019-03-02 18:58

I am having trouble reading strings with sscanf. I have stripped the code down to focus on the problem. Below is a function from the full program that is supposed to open a file and read from it, but sscanf is behaving strangely. For instance, I declare a string called atm with the content 'ATOM'. Before the sscanf call it prints as ATOM, while after the call it prints as an empty string. What could be the problem? I assume it is an allocation problem, but I could not find it. I tried suggestions from other threads, such as replacing %s with other conversion specifiers, but that did not help.

 void Get (struct protein p, int mode, int type) 
 {
   FILE *fd; //input file
   char name[100]="1CMA"; //array for input file name
   char string[600]; //the array where each line of the data file is stored when reading
   char atm[100]="ATOM";
   char begin[4];
   int index1 =0;

   fd = fopen(name, "r"); // open the input file

   if(fd==NULL) {
     printf("Error: can't open file.\n");
     return 1;
   }    

   if( type==0 ) { //pdb file type
     if( mode==0 ) { 
       while( fgets(string, 600, fd)!=NULL ) {
         printf("1 %s\n",atm);
         sscanf (string, "%4s", begin );
         printf("2 %s \n",atm);
       }
     }   
   }
   fclose(fd);
   free(fd);
   free(name);
 }

1 answer
聊天终结者
#2 · 2019-03-02 19:42

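The string begin isn't big enough to hold the four characters that sscanf can read with %4s plus the terminating \0: it needs at least five bytes, for example char begin[5]. With char begin[4], the \0 is written one byte past the end of the array. If that byte happens to be the first character of atm (this depends on where the compiler places the strings in memory), atm is modified and becomes an empty string. From the sscanf manpage, about the s directive: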

s    Matches a sequence of non-white-space characters; the next pointer must be a pointer to character array that is long enough to hold the input sequence and the terminating null byte ('\0'), which is added automatically. The input string stops at white space or at the maximum field width, whichever occurs first.
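
Applied to the question's code, the rule quoted above means that a buffer read with %4s needs room for at least five bytes. The following is only a minimal sketch of that sizing rule; first_field is a hypothetical helper name that does not appear in the original code, and the return-value check is an addition for illustration.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original code): copies the first
 * whitespace-delimited field of `line`, at most 4 characters, into `out`.
 * `out` must have room for 5 bytes: 4 characters plus the '\0' that
 * sscanf appends automatically.
 * Returns 1 if a field was read, 0 otherwise. */
int first_field(const char *line, char out[5])
{
    /* sscanf returns the number of successful conversions,
     * or EOF if the input ends before the first conversion. */
    return sscanf(line, "%4s", out) == 1;
}
```

Called with a line that starts with ATOM and a five-byte buffer, this leaves "ATOM" in the buffer and never writes past it, so a neighbouring array such as atm is not touched.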

I was able to reproduce this behavior on my machine, though the exact positioning of the strings in memory was a bit different. By printing the addresses of the strings, though, it is easy to figure out exactly what's happening. Here's a minimal example:

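#include <stdio.h>

int main() {
  char begin[2];          /* deliberately too small for the 16 characters read below */
  char atm[100]="ATOM";

  /* %p expects a void *, so cast the char pointers */
  printf("begin:    %p\n", (void *)begin);
  printf("begin+16: %p\n", (void *)(begin+16));
  printf("atom:     %p\n", (void *)atm);
  printf("1 %s\n",atm);
  /* overflows begin on purpose: 16 characters plus '\0' go into a 2-byte array */
  sscanf("AAAABBBBCCCCDDDD", "%16s", begin);
  printf("2 %s \n",atm);
  return 0;
}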

This produces the output:

$ ./a.out 
begin:    0x7fffffffe120
begin+16: 0x7fffffffe130
atom:     0x7fffffffe130
1 ATOM
2  

I printed the values of the pointers to figure out how long a string it would take to overflow into atm. Since (on my machine) atm begins at begin+16, reading sixteen characters into begin puts the null terminator at begin+16, which is the first character of atm, so atm now has length 0.
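
One way to keep this kind of off-by-one from creeping back in is to derive both the buffer size and the field width from a single constant. The sketch below is one possible arrangement, not something taken from the original code: TAG_WIDTH, STR, STR_ and read_tag are all hypothetical names chosen for illustration.

```c
#include <stdio.h>

#define TAG_WIDTH 4      /* hypothetical: maximum characters in the field */
#define STR_(x) #x
#define STR(x) STR_(x)   /* expands TAG_WIDTH first, then stringifies it: "4" */

/* Hypothetical helper: reads the first whitespace-delimited field of `line`
 * into `tag`. The buffer is TAG_WIDTH + 1 bytes, so there is always room
 * for the '\0' that sscanf appends.
 * Returns 1 if a field was read, 0 otherwise. */
int read_tag(const char *line, char tag[TAG_WIDTH + 1])
{
    /* Adjacent string literals are concatenated, giving the format "%4s". */
    return sscanf(line, "%" STR(TAG_WIDTH) "s", tag) == 1;
}
```

Because the width in the format string and the array size both come from TAG_WIDTH, changing the constant cannot leave the buffer one byte short.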
