Corrupted output with C++, cin, cout, threads and

Posted 2020-05-18 02:10

Question:

I am trying to write a C++ program that processes a lot of packets as fast as possible. All the packets come from standard input; each one should be read as fast as possible, sent to a thread from a pool for processing, and then handed to an output thread that writes the packet to standard output.

When you use standard input and output in C++, it's recommended to call std::ios_base::sync_with_stdio(false) before any input or output. In some environments this gives a great speedup, although you should avoid mixing in the standard C input/output functions (printf, scanf and so on) after the call.

Well, this seems to work perfectly in a single thread. But, as I said, my intention is to use one thread for input, one for output and multiple threads for parallel processing. I've observed some problems with the output. This is the output thread (very simplified):

void PacketDispatcher::thread_process_output(OutputQueue& output_queue) {
    std::vector<Packet> packet_list;
    while(output_queue.get(packet_list)) {
        for (const auto& packet: packet_list) {
            std::cout << "Packet id = " << packet.id << "\n";
        }
    }
    std::cout.flush();
}

If I used std::endl instead of "\n" there was less corruption, but std::endl forces a flush of the stream, which hurts performance in this case (and the problem wasn't solved, only reduced).

That's the only point in the program that uses std::cout. If I call std::ios_base::sync_with_stdio(false) at the beginning of the program I get a noticeable speedup, but my output is always corrupted in some way:

Packet id = Packet id = 4
Packet id = 5
Packet id = 6
Packet id = 7
Packet id = 8
Packet id = 9
Packet id = 10

So, where is the problem? Isn't C++ able to do multithreading using fast standard input/output?

Answer 1:

I finally found the culprit. If you search the Internet, a lot of sites recommend the sync_with_stdio call, but they don't talk about threads.

Other sites talk about iostreams and threads, like this one, but that doesn't explain why I was getting corrupted output when I was using std::cin in only one thread, and std::cout in its own thread too.

The problem is that, internally, the std::cin input thread was calling into std::cout to flush its buffer, and since the streams were not synchronized with a mutex or anything similar, the output was corrupted. Why should I synchronize the buffers if they are doing different things? Why was std::cin messing with std::cout?

In C++, by default, the standard streams cin and cerr are tied to cout (clog is not tied). What does this mean? It means that before each input operation on cin, cout is flushed first. Sometimes this is useful, for example so that a prompt written to cout is visible before the program waits for input, as you can read here.
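To see the tie mechanism in isolation, here is a minimal sketch that is not part of the original program. It uses string streams instead of cin and cout, and the names CountingBuf and flushes_caused_by_read are made up for illustration. It counts how often the output buffer is asked to flush while a tied (or untied) input stream performs one read.

```cpp
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical helper: a string buffer that counts how many times it is
// asked to flush (sync), so the effect of tie() becomes observable.
class CountingBuf : public std::stringbuf {
public:
    int flushes = 0;
protected:
    int sync() override {
        ++flushes;
        return 0;  // nothing to push further; report success
    }
};

// Performs one formatted read on an input stream that is either tied to an
// output stream or not, and returns how many flushes that output stream saw.
int flushes_caused_by_read(bool tied) {
    CountingBuf buf;
    std::ostream out(&buf);
    out << "pending output";          // buffered; no flush happens here

    std::istringstream in("42");
    in.tie(tied ? &out : nullptr);    // same call as std::cin.tie(...)

    int value = 0;
    in >> value;                      // a tied stream flushes `out` first
    return buf.flushes;
}
```

With the common standard library implementations, flushes_caused_by_read(true) should report at least one flush and flushes_caused_by_read(false) none. That hidden flush is what the input thread was doing to std::cout behind the output thread's back.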

But in my case this was causing serious issues. So, how do you untie the streams? It's very easy using the tie method:

std::ios_base::sync_with_stdio(false);

std::cin.tie(nullptr);
std::cerr.tie(nullptr);

Or if your compiler doesn't support C++11:

std::ios_base::sync_with_stdio(false);

std::cin.tie(static_cast<std::ostream*>(0));
std::cerr.tie(static_cast<std::ostream*>(0));
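The calls above can also be wrapped in a small helper that is run once at the start of main, before any thread touches the streams. This is only a sketch, and the function name untie_standard_streams is an assumption, not something from the original program. It relies on the fact that tie(new_value) returns the previously tied stream, which makes it easy to check what changed.

```cpp
#include <iostream>

// Hypothetical helper: disables C stdio synchronization and unties the
// standard streams. Returns the stream std::cin was tied to beforehand.
std::ostream* untie_standard_streams() {
    std::ios_base::sync_with_stdio(false);

    // tie(p) installs p as the new tied stream and returns the old one.
    std::ostream* previous = std::cin.tie(nullptr);
    std::cerr.tie(nullptr);
    return previous;
}
```

In a freshly started program the returned pointer should be &std::cout, and afterwards both std::cin.tie() and std::cerr.tie() return a null pointer. Note that untying only removes the hidden flush; if more than one thread writes to std::cout, those writes still need their own synchronization.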

With these changes my output is now correct:

Packet id = 1
Packet id = 2
Packet id = 3
Packet id = 4
Packet id = 5
Packet id = 6
Packet id = 7
Packet id = 8
Packet id = 9
Packet id = 10

And as it avoids doing a flush every time std::cin is used, it's faster too :-)