Consider the following simple C program that reads a file into a buffer and displays that buffer on the console:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *file;
    char *buffer;
    unsigned long fileLen;

    //Open file
    file = fopen("HelloWorld.txt", "rb");
    if (!file)
    {
        fprintf(stderr, "Unable to open file %s", "HelloWorld.txt");
        return 1;
    }

    //Get file length
    fseek(file, 0, SEEK_END);
    fileLen = ftell(file);
    fseek(file, 0, SEEK_SET);

    //Allocate memory
    buffer = (char *)malloc(fileLen + 1);
    if (!buffer)
    {
        fprintf(stderr, "Memory error!");
        fclose(file);
        return 1;
    }

    //Read file contents into buffer
    fread(buffer, fileLen, 1, file);

    //Send buffer contents to stdout
    printf("%s\n", buffer);
    fclose(file);
    return 0;
}
The file it will read simply contains:
Hello World!
The output is:
Hello World!²²²²▌▌▌▌▌▌▌↔☺
It has been a while since I did anything significant in C/C++. Normally I would assume the buffer was being allocated larger than necessary, but that does not appear to be the case.
fileLen ends up being 12, which is accurate.
I am thinking now that I must just be displaying the buffer wrong, but I am not sure what I am doing wrong.
Can anyone clue me in to what I am doing wrong?
JesperE is correct regarding the NUL-termination issue in your example. I'll just add that if you are processing text files, it would be better to use fgets() or something similar on a stream opened in text mode, as this will properly handle newline sequences across different platforms and will always NUL-terminate the string for you. If you are really working with binary data, then you don't want to use printf() to output the data, as the printf functions expect strings and a NUL byte in the data will cause truncation of the output.
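As a rough sketch of the fgets() suggestion, assuming a text file read line by line (the helper name print_text_file and the 256-byte line buffer are illustrative choices, not something from the question):

```c
#include <stdio.h>

/* Hypothetical helper: copies a text file to `out`, one fgets() call at a time.
   Returns the number of successful fgets() calls, or -1 if the file
   could not be opened. */
int print_text_file(const char *path, FILE *out)
{
    char line[256];   /* arbitrary size; longer lines simply arrive in pieces */
    int chunks = 0;
    FILE *file = fopen(path, "r");   /* text mode: the library translates newlines */

    if (!file)
        return -1;

    /* fgets() always NUL-terminates what it stores, so fputs() is safe here */
    while (fgets(line, sizeof line, file)) {
        fputs(line, out);
        chunks++;
    }

    fclose(file);
    return chunks;
}
```

Calling print_text_file("HelloWorld.txt", stdout) would print the file without any manual terminator handling.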
JesperE's approach will work, but you may be interested to know that there's an alternate way of handling this.
You can always print a string of known length, even when there's no NUL terminator, by providing the length to printf as the precision for the string field (the %.*s conversion). This allows you to print the string without modifying the buffer.
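A minimal sketch of the precision approach; the helper name print_counted is made up for illustration, and the cast assumes the length fits in an int, which is the type the * precision argument must have:

```c
#include <stdio.h>

/* Hypothetical helper: writes at most `len` bytes of `buf` followed by a
   newline, without needing a NUL terminator in `buf`.
   Returns whatever fprintf() returns. */
int print_counted(FILE *out, const char *buf, size_t len)
{
    /* The '*' precision consumes an int argument, hence the cast. */
    return fprintf(out, "%.*s\n", (int)len, buf);
}
```

In the question's program this would be print_counted(stdout, buffer, fileLen). Note that output still stops early if the data happens to contain a NUL byte before len bytes have been written.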
You can use calloc instead of malloc to allocate memory that is already initialised. calloc takes one extra argument. It's useful for allocating arrays; the first parameter of calloc indicates the number of elements in the array that you would like to allocate memory for, and the second argument is the size of each element. Since the size of a char is always 1, we can just pass 1 as the second argument, i.e. calloc(fileLen + 1, 1).
In C, there is no need to cast the return value of malloc or calloc. Allocating this way ensures that the string will be NUL-terminated even if the reading of the file ended prematurely for whatever reason. calloc does take longer than malloc because it has to zero out all the memory you asked for before giving it to you.
You need to NUL-terminate your string. Add a terminating NUL byte at buffer[fileLen] before printing it.
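Both fixes can be sketched together as follows. This is only one possible arrangement: the helper name read_file_as_string is hypothetical, and it keeps the question's fseek()/ftell() size probe as-is, including whatever portability limits that has.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a file into a newly allocated, NUL-terminated
   buffer. Returns NULL on failure; the caller frees the result. */
char *read_file_as_string(const char *path)
{
    FILE *file = fopen(path, "rb");
    char *buffer;
    long fileLen;
    size_t got;

    if (!file)
        return NULL;

    /* Same size probe as in the question, with the return values checked. */
    if (fseek(file, 0, SEEK_END) != 0 || (fileLen = ftell(file)) < 0 ||
        fseek(file, 0, SEEK_SET) != 0) {
        fclose(file);
        return NULL;
    }

    /* calloc zero-fills, so the extra byte already acts as a terminator. */
    buffer = calloc((size_t)fileLen + 1, 1);
    if (!buffer) {
        fclose(file);
        return NULL;
    }

    /* With an element size of 1, fread returns the number of bytes read. */
    got = fread(buffer, 1, (size_t)fileLen, file);
    fclose(file);

    /* Explicit terminator; this is the line that is required with malloc. */
    buffer[got] = '\0';
    return buffer;
}
```

With calloc the explicit terminator is redundant, and with malloc the terminator alone is enough; either one fixes the garbage after "Hello World!".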
Your approach to determine file size by seeking to the end of the file and then using ftell() is wrong:
If it is a text file, opened without "b" in the second parameter to the fopen() call, then ftell() may not tell you the number of characters that you can read from the file. For example, Windows uses two bytes for end of line, but when read, it is one char. In fact, the return value of ftell() for streams opened in text mode is useful only in calls to fseek(), and not to determine file size.
If it is a binary file, opened with "b" in the second parameter to fopen(), then the C standard says that a binary stream need not meaningfully support fseek() calls with a whence value of SEEK_END.
So, what you are doing isn't necessarily going to work in standard C. Your best bet is to use fread() to read, and if you happen to need more memory, use realloc(). Your system may provide mmap(), or may make guarantees about setting the file position indicator to end-of-file for binary streams, but relying on those is not portable.
See also this C-FAQ: What's the difference between text and binary I/O?
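The fread()/realloc() suggestion could look roughly like this. It is a sketch under assumptions: the helper name read_stream, the 4096-byte starting capacity, and the doubling strategy are all illustrative choices, not part of the original answer.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads everything from an open stream without asking
   for its size first. Stores the byte count in *out_len (if non-NULL) and
   returns a NUL-terminated buffer the caller must free, or NULL on error. */
char *read_stream(FILE *stream, size_t *out_len)
{
    size_t cap = 4096;   /* arbitrary starting capacity */
    size_t len = 0;
    char *buf = malloc(cap);

    if (!buf)
        return NULL;

    for (;;) {
        /* Always keep one byte spare for the terminator. */
        size_t got = fread(buf + len, 1, cap - len - 1, stream);
        len += got;

        if (len < cap - 1)
            break;   /* short read: end of file or a read error */

        /* Buffer is full: double it and keep reading. */
        char *bigger = realloc(buf, cap * 2);
        if (!bigger) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        cap *= 2;
    }

    if (ferror(stream)) {
        free(buf);
        return NULL;
    }

    buf[len] = '\0';
    if (out_len)
        *out_len = len;
    return buf;
}
```

Because the length comes from what fread() actually returned, this works the same way for text and binary streams and never relies on ftell().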