Can a pipe in Linux ever lose data?

Published 2020-06-08 17:47

Question:

And is there an upper limit on how much data it can contain?

Answer 1:

Barring a machine crash, no, it can't lose data. It's easy to misuse a pipe and think you're losing data, however: either a write wrote less data than you requested and you didn't check its return value, or you did something wrong on the read side.
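To illustrate the return-value point, here is a minimal sketch (not from the original answer; the helper name write_all is just illustrative) of a write loop that retries after short writes and EINTR so nothing is silently dropped:

    /* Minimal sketch: keep writing until all bytes are handed to the pipe,
     * retrying short writes and EINTR instead of ignoring them. */
    #include <errno.h>
    #include <unistd.h>

    ssize_t write_all(int fd, const char *buf, size_t len)
    {
        size_t written = 0;
        while (written < len) {
            ssize_t n = write(fd, buf + written, len - written);
            if (n < 0) {
                if (errno == EINTR)      /* interrupted by a signal: retry */
                    continue;
                return -1;               /* real error (e.g. EPIPE, EAGAIN) */
            }
            written += (size_t)n;        /* short write: loop and send the rest */
        }
        return (ssize_t)written;
    }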

The maximum amount of data it can hold is system dependent -- if you try to write more than that, you'll either get a short write or the writer will block until space is available. The pipe(7) man page contains lots of useful information about pipes, including (on Linux at least) how big the buffer is: 4 KB before kernel 2.6.11 and 64 KB from 2.6.11 onward.
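As a side note, on Linux 2.6.35 and later you can also query (and resize) a particular pipe's capacity with the Linux-specific F_GETPIPE_SZ/F_SETPIPE_SZ fcntl commands. A small sketch, assuming a kernel new enough to support them:

    /* Sketch: print the current capacity of a freshly created pipe.
     * F_GETPIPE_SZ is Linux-specific (2.6.35+), not POSIX. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fds[2];
        if (pipe(fds) == -1) { perror("pipe"); return 1; }

        int size = fcntl(fds[0], F_GETPIPE_SZ);   /* capacity in bytes */
        printf("pipe capacity: %d bytes\n", size);

        close(fds[0]);
        close(fds[1]);
        return 0;
    }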

Edit:

Tim mentions SIGPIPE, which is also a potential issue that can appear to lose data. If the reader closes the pipe before reading everything in it, the unread data is thrown away, and the writer gets a SIGPIPE signal the next time it writes to the pipe, indicating that this has occurred. If the writer blocks or ignores SIGPIPE, the write fails with an EPIPE error instead. This covers the situation Paul mentioned.
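A small sketch of the EPIPE case, assuming the writer sets SIGPIPE to SIG_IGN so the failed write is reported as an error instead of killing the process:

    /* Sketch: close the read end, ignore SIGPIPE, then observe that a
     * write to the broken pipe fails with EPIPE rather than killing us. */
    #include <errno.h>
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fds[2];
        if (pipe(fds) == -1) { perror("pipe"); return 1; }

        signal(SIGPIPE, SIG_IGN);   /* turn the fatal signal into an error return */
        close(fds[0]);              /* reader end closed: the pipe is now broken */

        if (write(fds[1], "x", 1) == -1 && errno == EPIPE)
            fprintf(stderr, "reader is gone, write failed with EPIPE\n");

        close(fds[1]);
        return 0;
    }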

PIPE_BUF is a constant that tells you the limit of atomic writes to the pipe. Any write of this size or smaller will either succeed completely or block until it can succeed completely (or fail with EWOULDBLOCK/EAGAIN if the pipe is in non-blocking mode). It has no relation to the actual size of the kernel's pipe buffer, though obviously the buffer must be at least PIPE_BUF bytes to meet the atomicity guarantee.
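For reference, PIPE_BUF comes from <limits.h>; a trivial sketch just to show where the constant lives (on Linux it is 4096, independent of the 64 KB default pipe capacity):

    /* Sketch: PIPE_BUF is the atomic-write limit, not the pipe's capacity. */
    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        printf("PIPE_BUF = %d bytes (atomic write limit)\n", PIPE_BUF);
        return 0;
    }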



Answer 2:

Data can be lost in a pipe when the following happens:

  1. A process (the writer) writes n bytes of data to the pipe, where n≤PIPE_BUF. This write is guaranteed to be atomic and, since the pipe has room for it, will not block.
  2. A process (the reader) reads only m<n bytes of data and exits.
  3. The writer doesn’t attempt to write to the pipe again.

As a result, the kernel pipe buffer will contain n-m bytes which will be lost when all handles to the pipe have been closed. The writer will not see SIGPIPE or EPIPE since it never attempts to write to the pipe again. Since the writer won’t ever learn that the pipe contains leftover data that will simply disappear, one can consider this data lost.

A non-standard way of detecting this would be for the writer to define a timeout and call the FIONREAD ioctl to determine the number of bytes left in the pipe buffer.
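A rough sketch of that idea; FIONREAD on a pipe is non-standard, so this is only known to work on Linux:

    /* Sketch: ask the kernel how many unread bytes remain in the pipe.
     * FIONREAD on a pipe descriptor is non-standard; Linux supports it
     * on either end of the pipe. */
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    int main(void)
    {
        int fds[2];
        if (pipe(fds) == -1) { perror("pipe"); return 1; }

        write(fds[1], "hello", 5);              /* 5 bytes now sit in the pipe */

        int unread = 0;
        if (ioctl(fds[1], FIONREAD, &unread) == 0)
            printf("%d bytes still unread in the pipe\n", unread);

        close(fds[0]);
        close(fds[1]);
        return 0;
    }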



Answer 3:

If you are referring to using the | operator in the shell, then no, it will not lose data. It merely connects the standard output stream of the app on the left to the standard input stream of the app on the right. If you are piping data between apps and not getting the results you expect, try using > to redirect the first app's standard output to a file, then use < to feed that file to the second app as standard input. That way you can inspect the file and make sure the data is being sent in the format you expect.

If you mean a pipe created by the pipe function, then the answer is still no. According to this man page, writing to a full pipe will block until enough data has been read to make room for the write. It also states that a pipe's capacity is 4 KB on Linux before 2.6.11 and 64 KB on 2.6.11 and later.



Answer 4:

Your pipe isn't losing data. If you're losing data in your application, try debugging it with gdb. A few things to look for:
1) Is your buffer large enough to hold all the data you're reading?
2) Check the return codes from your read() on the pipe for errors (see the sketch after this list).
3) Are you sure you're writing all of the data to the pipe?
4) Is your write/read being interrupted by a signal, e.g. SIGPIPE?
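For point 2, a minimal sketch (not from the answer above; read_all is an illustrative name) of a read loop that checks every return value and handles EINTR:

    /* Sketch: read from a pipe in a loop, checking every return value,
     * until end-of-file (read() returns 0) or a real error occurs. */
    #include <errno.h>
    #include <unistd.h>

    ssize_t read_all(int fd, char *buf, size_t cap)
    {
        size_t got = 0;
        while (got < cap) {
            ssize_t n = read(fd, buf + got, cap - got);
            if (n == 0)                      /* writer closed: end of data */
                break;
            if (n < 0) {
                if (errno == EINTR)          /* interrupted by a signal: retry */
                    continue;
                return -1;                   /* genuine error */
            }
            got += (size_t)n;
        }
        return (ssize_t)got;
    }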



Answer 5:

The reason you will not lose data is that when the buffer associated with the pipe fills up, a call to write will block until the reader has drained the buffer enough for the operation to complete. (You can do non-blocking writes as well, but then you are responsible for making sure you complete any writes that would otherwise have blocked.)
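A short sketch of the non-blocking case, assuming standard POSIX behavior: once the pipe fills up, write() fails with EAGAIN instead of blocking, and it is the caller's job to retry later:

    /* Sketch: make the write end non-blocking, fill the pipe, and observe
     * that write() reports EAGAIN instead of blocking when it is full. */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fds[2];
        char chunk[1024] = {0};
        long total = 0;

        if (pipe(fds) == -1) { perror("pipe"); return 1; }
        fcntl(fds[1], F_SETFL, O_NONBLOCK);          /* non-blocking writes */

        for (;;) {
            ssize_t n = write(fds[1], chunk, sizeof chunk);
            if (n < 0) {
                if (errno == EAGAIN || errno == EWOULDBLOCK) {
                    printf("pipe full after %ld bytes; caller must retry later\n", total);
                    break;
                }
                perror("write");
                break;
            }
            total += n;
        }

        close(fds[0]);
        close(fds[1]);
        return 0;
    }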



Tags: linux posix pipe