It all started with this question: how do you read blocks of data from a file and then read from that block into a vector?
With the aim of minimizing disk I/O operations, I ran a few experiments to see whether the buffer size has any effect on the program's running time.
I used the following two programs, one C-oriented and the other more C++-style (though both compiled with gcc):
The C-oriented code:
int buffer_size = 1024;

FILE *file = fopen(argv[1], "r");
FILE *out_file = fopen("in", "w");

/* Pass NULL so stdio allocates the buffer itself: a fixed 2048-byte
   array would be undefined behaviour once buffer_size exceeds 2048. */
setvbuf(file, NULL, _IOFBF, buffer_size);
setvbuf(out_file, NULL, _IOFBF, buffer_size);

char sl[1000];
while (fgets(sl, sizeof sl, file))   /* test the read itself, not feof() */
    fputs(sl, out_file);

fclose(file);
fclose(out_file);
The C code gave the following results (for a 14 MB file):
Buffer size (bytes)   Time
10                    18 sec
100                   2 sec
1024                  0.4 sec
10240                 0.3 sec
And for a 103 MB file:
1024                  ~8 sec
5120                  ~3 sec
10240                 ~3 sec
15360                 ~3 sec
It seems to reach a saturation point at a buffer size of about 5 KB (5120 bytes). Is there any particular reason for this?
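A plateau like this is what you would expect if the running time is dominated by the number of read()/write() system calls: once the buffer holds many lines, each syscall moves a big chunk and the per-call overhead stops mattering. To make the comparison repeatable, the copy loop can be wrapped in a small timing harness (the function name and the use of NULL buffers are my own, not from the original code):

```cpp
#include <stdio.h>
#include <stddef.h>
#include <time.h>

/* Copy `src` to `dst` line by line through `bufsize`-byte stdio buffers.
   Returns elapsed CPU seconds, or a negative value on error. */
static double timed_copy(const char *src, const char *dst, size_t bufsize)
{
    FILE *in = fopen(src, "r");
    FILE *out = fopen(dst, "w");
    if (!in || !out) {
        if (in) fclose(in);
        if (out) fclose(out);
        return -1.0;
    }

    /* NULL lets stdio allocate a buffer of exactly `bufsize` bytes. */
    setvbuf(in, NULL, _IOFBF, bufsize);
    setvbuf(out, NULL, _IOFBF, bufsize);

    clock_t start = clock();
    char line[1000];
    while (fgets(line, sizeof line, in))
        fputs(line, out);
    clock_t end = clock();

    fclose(in);
    fclose(out);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling this with a range of sizes (10, 100, 1024, 10240, ...) on the same input file reproduces the table above without re-editing and recompiling for each run.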
The C++-oriented code:
int buffer_size = 1024;

ifstream in_file(argv[1]);
char in_buffer[buffer_size];   // variable-length array: a GCC extension
in_file.rdbuf()->pubsetbuf(in_buffer, sizeof(in_buffer));

ofstream out_file("in");
char out_buffer[buffer_size];
out_file.rdbuf()->pubsetbuf(out_buffer, sizeof(out_buffer));   // was sizeof(in_buffer)

char sl[1024];
while (in_file >> sl)   // test the extraction itself, not eof()
    out_file << sl << endl;
My test input file was a 14 MB file with 1,000,000 lines.
Buffer size (bytes)   Time (~)
10                    6.5 sec
100                   6.5 sec
1024                  6.5 sec
C++ does not seem to care about the buffer size at all. Why?
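One likely cause worth checking: with libstdc++, pubsetbuf() only takes effect on a stream that is not yet open; calling it after the constructor has already opened the file is silently ignored, so all three runs may have used the same default buffer. A sketch of the two-step open (function name and test path are my own):

```cpp
#include <fstream>
#include <string>
#include <cstddef>

// Count lines in `path` using a caller-supplied stream buffer.
// NOTE: with libstdc++, pubsetbuf() must run *before* open(); the
// ifstream(argv[1]) constructor opens the file immediately, so a
// later pubsetbuf() call is silently ignored.
std::size_t count_lines(const char *path, char *buf, std::size_t bufsize)
{
    std::ifstream in;                     // default-construct: not open yet
    in.rdbuf()->pubsetbuf(buf, bufsize);  // honored, because not open yet
    in.open(path);                        // now opens with our buffer

    std::size_t n = 0;
    std::string line;
    while (std::getline(in, line))
        ++n;
    return n;
}
```

Re-running the timing test with this pattern should show whether the buffer size actually starts to matter once pubsetbuf() is in effect.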
Also, the C++ code is about 15 times slower (comparing against the C code with a 1 KB buffer)! Is ifstream usually slower than FILE *? (Other answers on SO seem to suggest there is no difference.) Or is there something else in the code causing the slowness?