Why is this C code faster than this C++ code?

Published 2020-02-17 09:40

Question:

I have two versions of a program that do basically the same thing: find the longest line in a file. I have a file with about 8,000 lines, and my code in C is a little more primitive (of course!) than my code in C++. The C program takes about 2 seconds to run, while the C++ program takes 10 seconds (testing with the same file in both cases). But why? I expected it to take about the same time, or a little longer, but not 8 seconds slower!

my code in C:

#include <stdio.h>
#include <stdlib.h> 
#include <string.h>

#if _DEBUG
    #define DEBUG_PATH "../Debug/"
#else
    #define DEBUG_PATH ""
#endif

const char FILE_NAME[] = DEBUG_PATH "data.noun";

int main()
{   
    int sPos = 0;
    int maxCount = 0;
    int cPos = 0;
    int ch;
    FILE *in_file;              

    in_file = fopen(FILE_NAME, "r");
    if (in_file == NULL) 
    {
        printf("Cannot open %s\n", FILE_NAME);
        exit(8);
    }       

    while (1) 
    {
        ch = fgetc(in_file);
        if(ch == 0x0A || ch == EOF) // '\n' (text mode also maps \r\n to this) or end of file
        {           
            if ((cPos - sPos) > maxCount)
                maxCount = (cPos - sPos);

            if(ch == EOF)
                break;

            sPos = cPos;
        }
        else
            cPos++;
    }

    fclose(in_file);

    printf("Max line length: %i\n",  maxCount); 

    getchar();
    return (0);
}

my code in C++:

#include <iostream>
#include <fstream>
#include <stdio.h>
#include <string>

using namespace std;

#ifdef _DEBUG
    #define FILE_PATH "../Debug/data.noun"
#else
    #define FILE_PATH "data.noun"
#endif

int main()
{
    string fileName = FILE_PATH;
    string s = "";
    ifstream file;
    int size = 0;

    file.open(fileName.c_str());
    if(!file)
    {
        printf("could not open file!");
        return 0;
    }

    while (getline(file, s))
        size = (s.length() > size) ? s.length() : size;
    file.close();

    printf("biggest line in file: %i", size);   

    getchar();
    return 0;
}

Answer 1:

The C++ version constantly allocates and deallocates memory for std::string instances. Memory allocation is a costly operation, and on top of that the constructors/destructors are executed.

The C version, however, uses constant memory and does only what is necessary: it reads single characters and, at each newline, updates the line-length counter if the current line is longer. That's it.
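
If repeated allocation really were the whole story, giving the string its capacity up front should close most of the gap. Here is a minimal sketch of that experiment (the 4096-byte bound and the file name are assumptions; note also that std::getline reuses the string's existing capacity between iterations, so the original loop only reallocates when it meets a line longer than any seen before):

#include <algorithm>
#include <fstream>
#include <iostream>
#include <string>

int main()
{
    std::ifstream file("data.noun");  // file name taken from the question
    std::string s;
    s.reserve(4096);                  // assumed upper bound on line length

    std::string::size_type size = 0;
    // getline() reuses s's capacity, so after the reserve() this loop
    // should perform no allocations at all.
    while (std::getline(file, s))
        size = std::max(s.length(), size);

    std::cout << "Max line length: " << size << '\n';
}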



Answer 2:

My guess is that it is a problem with the compiler options you are using, the compiler itself, or the file system. I just compiled both versions (with optimizations on) and ran them against a 92,000-line text file:

c++ version:  113 ms
c version:    179 ms

And I suspect the reason the C++ version is faster is that fgetc is most likely slower. fgetc does use buffered I/O, but it still makes a function call to retrieve each character. I've tested it before: fgetc is not as fast as reading an entire line in one call (e.g., with fgets).
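
To illustrate the fgets point, here is a sketch of a line-at-a-time C version (the file name is the one from the question; the accumulation in cur handles lines longer than the 4 KB buffer, whose size is an arbitrary choice):

#include <stdio.h>
#include <string.h>

int main(void)
{
    char buf[4096];
    size_t cur = 0, max = 0;
    FILE *in = fopen("data.noun", "r"); /* file name from the question */
    if (!in) return 1;

    while (fgets(buf, sizeof buf, in)) {
        size_t n = strlen(buf);
        int full_line = (n > 0 && buf[n - 1] == '\n');
        if (full_line)
            n--;                        /* don't count the newline itself */
        cur += n;                       /* a long line may span several fgets() calls */
        if (full_line) {
            if (cur > max) max = cur;
            cur = 0;
        }
    }
    if (cur > max) max = cur;           /* last line may lack a trailing newline */
    fclose(in);
    printf("Max line length: %zu\n", max);
    return 0;
}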



Answer 3:

In a few comments I echoed other people's answers suggesting that the problem was likely the extra copying done by your C++ version, which pulls each line into memory in a string. But I wanted to test that.

First I implemented the fgetc and getline versions and timed them. I confirmed that in debug mode the getline version is slower: about 130 ms vs. 60 ms for the fgetc version. This is unsurprising, given the conventional wisdom that iostreams are slower than stdio. However, in my past experience iostreams get a significant speed-up from optimization, and that was confirmed when I compared my release-mode times: about 20 ms using getline and 48 ms with fgetc.

The fact that getline with iostreams is faster than fgetc, at least in release mode, runs counter to the reasoning that copying all that data must be slower than not copying it. I'm not sure what the optimizer manages to avoid, and I didn't really look for an explanation, but it would be interesting to understand what is being optimized away. edit: when I looked at the programs with a profiler, it wasn't obvious how to compare their performance, since the different methods looked so different from each other.

Anyway, I wanted to see if I could get a faster version by avoiding the copying, using the get() method on the fstream object to do exactly what the C version is doing. When I did this I was quite surprised to find that fstream::get() was quite a bit slower than both the fgetc and getline methods, in both debug and release: about 230 ms in debug and 80 ms in release.

To narrow down the slow-down I did another version, this time going through the streambuf attached to the fstream object and its snextc() method. This version is by far the fastest: 25 ms in debug and 6 ms in release.

I'm guessing that what makes the fstream::get() method so much slower is that it constructs a sentry object for every call. Though I haven't tested this, I can't see that get() does much beyond getting the next character from the streambuf, except for these sentry objects.
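
For reference, every call to istream::get() conceptually does something like the following. This is a simplified sketch of the behavior the standard mandates, not any particular library's source:

#include <istream>
#include <string>

// Roughly what istream::get() has to do per character: construct a sentry
// (which checks stream state and may flush a tied stream) before the one
// cheap streambuf call that actually fetches the character.
int get_like(std::istream& is)
{
    std::istream::sentry ok(is, /*noskipws=*/true);
    if (!ok)
        return std::char_traits<char>::eof();
    int ch = is.rdbuf()->sbumpc();
    if (ch == std::char_traits<char>::eof())
        is.setstate(std::ios::eofbit | std::ios::failbit);
    return ch;
}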

Anyway, the moral of the story is that if you want fast I/O you're probably best off using the high-level iostream functions rather than stdio, and for really fast I/O, accessing the underlying streambuf directly. edit: actually this moral may only apply to MSVC; see the update at the bottom for results from a different toolchain.

For reference:

I used VS2010 and chrono from Boost 1.47 for the timing. I built 32-bit binaries (Boost.Chrono seemed to require this; I couldn't find a 64-bit build of that library). I didn't tweak the compile options, but they may not be completely standard since I did this in a scratch VS project I keep around.

The file I tested with was the 1.1 MB, 20,000-line plain-text version of Oeuvres Complètes de Frédéric Bastiat, tome 1 by Frédéric Bastiat, from Project Gutenberg: http://www.gutenberg.org/ebooks/35390

Release mode times:

fgetc time is: 48150 microseconds
snextc time is: 6019 microseconds
get time is: 79600 microseconds
getline time is: 19881 microseconds

Debug mode times:

fgetc time is: 59593 microseconds
snextc time is: 24915 microseconds
get time is: 228643 microseconds
getline time is: 130807 microseconds

Here's my fgetc() version (each of the snippets below is one block from a single benchmark program; assume the usual headers — <cstdio>, <cassert>, <fstream>, <iostream>, <string>, <algorithm> — plus Boost.Chrono's I/O header are included at the top of that file):

{
    auto begin = boost::chrono::high_resolution_clock::now();
    FILE *cin = fopen("D:/bames/automata/pg35390.txt","rb");
    assert(cin);
    unsigned maxLength = 0;
    unsigned i = 0;
    int ch;
    while(1) {
        ch = fgetc(cin);
        if(ch == 0x0A || ch == EOF) {
            maxLength = std::max(i,maxLength);
            i = 0;
            if(ch==EOF)
                break;
        } else {
            ++i;
        }
    }
    fclose(cin);
    auto end = boost::chrono::high_resolution_clock::now();
    std::cout << "max line is: " << maxLength << '\n';
    std::cout << "fgetc time is: " << boost::chrono::duration_cast<boost::chrono::microseconds>(end-begin) << '\n';
}

Here's my getline() version:

{
    auto begin = boost::chrono::high_resolution_clock::now();
    std::ifstream fin("D:/bames/automata/pg35390.txt",std::ios::binary);
    std::string::size_type maxLength = 0; // match line.size() so std::max's argument types agree
    std::string line;
    while(std::getline(fin,line)) {
        maxLength = std::max(line.size(),maxLength);
    }
    auto end = boost::chrono::high_resolution_clock::now();
    std::cout << "max line is: " << maxLength << '\n';
    std::cout << "getline time is: " << boost::chrono::duration_cast<boost::chrono::microseconds>(end-begin) << '\n';
}

the fstream::get() version

{
    auto begin = boost::chrono::high_resolution_clock::now();
    std::ifstream fin("D:/bames/automata/pg35390.txt",std::ios::binary);
    unsigned maxLength = 0;
    unsigned i = 0;
    while(1) {
        int ch = fin.get();
        if((fin.good() && ch == 0x0A) || fin.eof()) {
            maxLength = std::max(i,maxLength);
            i = 0;
            if(fin.eof())
                break;
        } else {
            ++i;
        }
    }
    auto end = boost::chrono::high_resolution_clock::now();
    std::cout << "max line is: " << maxLength << '\n';
    std::cout << "get time is: " << boost::chrono::duration_cast<boost::chrono::microseconds>(end-begin) << '\n';
}

and the snextc() version

{
    auto begin = boost::chrono::high_resolution_clock::now();
    std::ifstream fin("D:/bames/automata/pg35390.txt",std::ios::binary);
    std::filebuf &buf = *fin.rdbuf();
    unsigned maxLength = 0;
    unsigned i = 0;
    while(1) {
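        // note: snextc() advances first and then reads, so the file's very first character is never examined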
        int ch = buf.snextc();
        if(ch == 0x0A || ch == std::char_traits<char>::eof()) {
            maxLength = std::max(i,maxLength);
            i = 0;
            if(ch == std::char_traits<char>::eof())
                break;
        } else {
            ++i;
        }
    }
    auto end = boost::chrono::high_resolution_clock::now();
    std::cout << "max line is: " << maxLength << '\n';
    std::cout << "snextc time is: " << boost::chrono::duration_cast<boost::chrono::microseconds>(end-begin) << '\n';
}

update:

I reran the tests using clang (trunk) on OS X with libc++. The results for the iostream-based implementations stayed relatively the same (with optimization turned on): fstream::get() is much slower than std::getline(), which is much slower than filebuf::snextc(). But the performance of fgetc() improved relative to the getline() implementation and became faster than it. Perhaps this is because the copying done by getline() becomes an issue with this toolchain whereas it wasn't with MSVC, or maybe Microsoft's CRT implementation of fgetc() is just slow.

Anyway, here are the times (I used a much larger file, 5.3 MB):

using -Os

fgetc time is: 39004 microseconds
snextc time is: 19374 microseconds
get time is: 145233 microseconds
getline time is: 67316 microseconds

using -O0

fgetc time is: 44061 microseconds
snextc time is: 92894 microseconds
get time is: 184967 microseconds
getline time is: 209529 microseconds

using -O2

fgetc time is: 39356 microseconds
snextc time is: 21324 microseconds
get time is: 149048 microseconds
getline time is: 63983 microseconds

using -O3

fgetc time is: 37527 microseconds
snextc time is: 22863 microseconds
get time is: 145176 microseconds
getline time is: 67899 microseconds


Answer 4:

You are not comparing apples to apples. Your C program does no copying of data from the FILE* buffer into your program's memory, and it operates on the raw file contents.

Your C++ program needs to traverse the length of each string several times: once in the stream code, to know when to terminate the string that it returns to you; once in the constructor of std::string; and once in your code's call to s.length().

It is possible that you could improve the performance of your C program, for example by using getc_unlocked if it is available to you. But the biggest win comes from not having to copy your data.
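
As a sketch of that suggestion, here is the question's C loop with getc_unlocked() swapped in. getc_unlocked() is POSIX rather than standard C, and is only safe when a single thread uses the stream, which is the case here:

#include <stdio.h>

int main(void)
{
    FILE *in = fopen("data.noun", "r"); /* file name from the question */
    int ch, cur = 0, max = 0;
    if (!in) return 1;

    /* getc_unlocked() behaves like getc() but skips the per-call stream lock */
    while ((ch = getc_unlocked(in)) != EOF) {
        if (ch == '\n') {
            if (cur > max) max = cur;
            cur = 0;
        } else {
            cur++;
        }
    }
    if (cur > max) max = cur;           /* last line may lack a newline */
    fclose(in);
    printf("Max line length: %d\n", max);
    return 0;
}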

EDIT: edited in response to a comment by bames53



Answer 5:

2 seconds for just 8,000 lines? I don't know how long your lines are, but the chances are that you are doing something very wrong.

This trivial Python program executes almost instantly on El Quijote downloaded from Project Gutenberg (40,006 lines, 2.2 MB):

import sys
print(max(len(s) for s in sys.stdin))

The timing:

~/test$ time python maxlen.py < pg996.txt
76

real    0m0.034s
user    0m0.020s
sys     0m0.010s

You could improve your C code by buffering the input rather than reading it character by character.

As for why the C++ is slower than the C: it is probably related to building the string objects and then calling the length method on them. In C you are just counting the characters as you go.
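
A sketch of that buffering suggestion: read large chunks with fread() and scan them in memory. The 64 KB buffer size is an arbitrary choice, and the file name is the one from the question:

#include <stdio.h>

int main(void)
{
    char buf[1 << 16];                  /* 64 KB chunks */
    size_t n, i, cur = 0, max = 0;
    FILE *in = fopen("data.noun", "rb");
    if (!in) return 1;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        for (i = 0; i < n; i++) {
            if (buf[i] == '\n') {
                if (cur > max) max = cur;
                cur = 0;
            } else {
                cur++;                  /* cur carries partial lines across chunks */
            }
        }
    }
    if (cur > max) max = cur;           /* file may not end with a newline */
    fclose(in);
    printf("Max line length: %zu\n", max);
    return 0;
}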



Answer 6:

I tried compiling and running your programs against 40K lines of C++ source and they both completed in about 25 ms. I can only conclude that your input file has extremely long lines, possibly 10K-100K characters per line. In that case the C version suffers no penalty from the long line length, while the C++ version has to keep growing the string and copying the old data into the new buffer. If it had to grow a sufficient number of times, that could account for the excessive performance difference.

The key here is that the two programs don't do the same thing, so you can't really compare their results. If you were able to provide the input file, we might be able to provide additional details.

You could probably use tellg and ignore to do this faster in C++.
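
Something along those lines might look like the sketch below. As it turns out, gcount() already reports how many characters ignore() skipped, so explicit tellg() arithmetic isn't even needed:

#include <fstream>
#include <iostream>
#include <limits>

int main()
{
    std::ifstream file("data.noun", std::ios::binary); // file name from the question
    std::streamsize maxLen = 0;

    while (file) {
        // Skip up to and including the next '\n' without copying the line.
        file.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
        std::streamsize len = file.gcount();
        if (len > 0 && !file.eof())
            --len;                      // exclude the '\n' itself
        if (len > maxLen)
            maxLen = len;
    }
    std::cout << "Max line length: " << maxLen << '\n';
}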



Answer 7:

The C++ program builds string objects for the lines, while the C program just reads characters and looks at them.

EDIT:

Thanks for the upvotes, but after the discussion I now think this answer is wrong. It was a reasonable first guess, but in this case it seems that the different (and very slow) execution times are caused by other things.



Answer 8:

I'm alright with the theory, folks. But let's get empirical.

I generated a text file with 13 million lines to work with:

~$ for i in {0..1000}; do cat /etc/* | strings; done &> huge.txt

The original code, edited to read from stdin (which shouldn't affect the performance much), took almost 2 minutes.

C++ code:

#include <iostream>
#include <string>
#include <stdio.h>

using namespace std;

int main(void)
{
    string s;
    string::size_type size = 0;

    // test getline's result, not cin, or the final failed read is processed too
    while (getline(cin, s))
        size = (s.length() > size) ? s.length() : size;
    printf("Biggest line in file: %zu\n", size);

    return 0;
}

C++ time:

~$ time ./cplusplus < huge.txt
real    1m53.122s
user    1m29.254s
sys     0m0.544s

A 'C' version:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *line = NULL;
    size_t len = 0;          /* getline() needs a size_t here, not an int */
    ssize_t read;
    int max = 0;

    while ((read = getline(&line, &len, stdin)) != -1)
        if (max < read - 1)
            max = read - 1;  /* 'read' includes the trailing newline */
    free(line);
    printf("Biggest line in file %d\n", max);

    return 0;
}

C performance:

~$ time ./ansic < huge.txt
real    0m4.015s
user    0m3.432s
sys     0m0.328s

Do your own math...