The following function gets file offsets from the rabin_polynomial structure, reads each chunk from input_file to generate an MD5 fingerprint, and writes the result to fpfile.
My problem is that it sometimes seems to reuse the same chunk_buffer content, so it generates identical fingerprints for chunks with different lengths.
What could be the reason?
I have tested the md5 function separately with other inputs, and it generates correct digests.
int write_rabin_fingerprints_to_binary_file(FILE *fpfile, FILE *input_file,
                                            struct rabin_polynomial *head)
{
    struct rabin_polynomial *poly = head;
    unsigned char fing_print[33] = {'\0'};
    size_t bytes_read;
    while (poly != NULL)
    {
        char *chunk_buffer;
        chunk_buffer = (char*) malloc((poly->length));
        bytes_read = fread(chunk_buffer, 1, poly->length, input_file);
        if (bytes_read != poly->length)
        {
            printf("Error reading from%s ", input_file);
            return -1;
        }
        strncpy((char*)fing_print, md5(chunk_buffer).c_str(), 32);
        size_t ret_val = fprintf(fpfile, "%llu\t%lu\t%s\n", poly->start,
                                 poly->length, fing_print);
        if (ret_val == 0)
        {
            fprintf(stderr, "Could not write rabin polynomials to file.");
            return -1;
        }
        poly = poly->next_polynomial;
        free(chunk_buffer);
    }
    return 0;
}
EDIT:
I am running this program in Visual Studio 2010. Could the cast to char * on the malloc() line cause the problem?
The number of bytes read is just as specified in the argument.
There was nothing wrong in the code that would cause this. I just found out that it happened because of zero-length strings, which are also called file holes.
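For anyone hitting the same symptom, here is a minimal sketch of why zero bytes would produce matching fingerprints. It assumes md5() takes a std::string (or a const char *), which the .c_str() call suggests but the question does not confirm. In that case md5(chunk_buffer) converts the buffer as a C string, so a chunk that begins with a zero byte is hashed as an empty string regardless of its length. The toy_digest function below is a hypothetical stand-in (FNV-1a, not MD5), and the two helper names are made up for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for md5(): any digest that takes a std::string
// shows the same effect. This is FNV-1a (64-bit), NOT MD5.
unsigned long long toy_digest(const std::string &data) {
    unsigned long long hash = 14695981039346656037ULL;  // FNV-1a offset basis
    for (std::size_t i = 0; i < data.size(); ++i) {
        hash ^= static_cast<unsigned char>(data[i]);
        hash *= 1099511628211ULL;  // FNV-1a prime
    }
    return hash;
}

// Mirrors md5(chunk_buffer): the char* is converted as a C string, so
// everything from the first zero byte onward is dropped. The buffer must
// contain a zero byte somewhere, otherwise the conversion reads past its end.
unsigned long long digest_as_c_string(const char *chunk) {
    return toy_digest(std::string(chunk));
}

// Passes the length explicitly, so embedded zero bytes are hashed as well.
unsigned long long digest_with_length(const char *chunk, std::size_t length) {
    return toy_digest(std::string(chunk, length));
}
```

With two chunks of different lengths that both start with a zero byte, digest_as_c_string returns the same value for both (the digest of the empty string), while digest_with_length tells them apart. If the assumption about md5() holds, the equivalent change in the function above would be to call md5(std::string(chunk_buffer, poly->length)) so that the whole chunk is hashed.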