Why read system call stops reading when less than

2020-05-03 01:13发布

Introduction and general objective

I am trying to send an image from a child process (generated by calling popen from the parent) to the parent process.

The image is a grayscale png image. It is opened with the OpenCV library and encoded using imencode function of the same library. So the resulting encoded data is stored into a std::vector structure of type uchar, namely the buf vector in the code below.

No error in sending preliminary image information

First the child sends the following image information needed by the parent:

  • size of the buf vector containing the encoded data: this piece of information is needed so that the parent will allocate a buffer of the same size where to write the image information that it will receive from the child. Allocation is performed as follows (buf in this case is the array used to received data not the vector containing the encoded data):

    u_char *buf = (u_char*)malloc(val*sizeof(u_char));
    
  • number of rows of the original image: needed by the parent to decode the image after all data have been received;
  • number of columns of the original image: needed by the parent to decode the image after all data have been received.

These data are written by the child on the standard output using cout and read by the parent using fgets system call.

This pieces of information are correctly sent and received so no problem until now.

Sending image data

The child writes the encoded data (i.e. the data contained in the vector buf) to the standard output using write system call while the parent uses the file-descriptor returned by popen to read the data. Data is read using read system call.

Data writing and reading is performed in blocks of 4096 bytes inside while loops. The writing line is the following:

written += write(STDOUT_FILENO, buf.data()+written, s);

where STDOUT_FILENO tells to write on standard output. buf.data() returns the pointer to the first element in the array used internally by the vector structure. written stores the number of bytes that have been written until now and it is used as index. s is the number of bytes (4096) that write will try to send each time. write returns the number of bytes that actually have been written and this is used to update written.

Data reading is very similar and it is performed by the following line:

bytes_read = read(fileno(fp), buf+total_bytes, bytes2Copy);

fileno(fp) is telling from where to read data (fp is the filedescriptor returned by popen). buf is the array where received data is stored and total_bytes are the number of bytes read until now so it is used as index. bytes2Copy is the number of bytes expected to be received: it is wither BUFLEN (i.e. 4096) or for the last block of data the remaining data (if for example the total bytes are 5000 then after 1 block of 4096 bytes another block of 5000-4096 is expected).

The code

Consider this example. The following is a process launching a child process with popen

#include <stdlib.h>
#include <unistd.h>//read
#include "opencv2/opencv.hpp"
#include <iostream>
#define BUFLEN 4096

int main(int argc, char *argv[])
{
    //file descriptor to the child process
    FILE *fp;
    cv::Mat frame;
    char temp[10];
    size_t bytes_read_tihs_loop = 0;
    size_t total_bytes_read = 0;
    //launch the child process with popen
    if ((fp = popen("/path/to/child", "r")) == NULL)
    {
        //error
        return 1;
    }

    //read the number of btyes of encoded image data
    fgets(temp, 10, fp);
    //convert the string to int
    size_t bytesToRead = atoi((char*)temp);

    //allocate memory where to store encoded iamge data that will be received
    u_char *buf = (u_char*)malloc(bytesToRead*sizeof(u_char));

    //some prints
    std::cout<<bytesToRead<<std::endl;

    //initialize the number of bytes read to 0
    bytes_read_tihs_loop=0;
    int bytes2Copy;
    printf ("bytesToRead: %ld\n",bytesToRead);
    bytes2Copy = BUFLEN;
    while(total_bytes_read<bytesToRead &&
        (bytes_read_tihs_loop = read(fileno(fp), buf+total_bytes_read, bytes2Copy))
    )
    {
        //bytes to be read at this iteration: either 4096 or the remaining (bytesToRead-total)
        bytes2Copy = BUFLEN < (bytesToRead-total_bytes_read) ? BUFLEN : (bytesToRead-total_bytes_read);
        printf("%d btytes to copy\n", bytes2Copy);
        //read the bytes
        printf("%ld bytes read\n", bytes_read_tihs_loop);

        //update the number of bytes read
        total_bytes_read += bytes_read_tihs_loop;
        printf("%lu total bytes read\n\n", total_bytes_read);
    }
    printf("%lu bytes received over %lu expected\n", total_bytes_read, bytesToRead);
    printf("%lu final bytes read\n", total_bytes_read);
    pclose(fp);
    cv::namedWindow( "win", cv::WINDOW_AUTOSIZE );
    frame  = cv::imdecode(cv::Mat(1,total_bytes_read,0, buf), 0);
    cv::imshow("win", frame);

    return 0;

}

and the process opened by the above corresponds to the following:

#include <unistd.h> //STDOUT_FILENO
#include "opencv2/opencv.hpp"
#include <iostream>
using namespace std;
using namespace cv;

#define BUFLEN 4096

int main(int argc, char *argv[])
{
    Mat frame;
    std::vector<uchar> buf;
    //read image as grayscale
    frame = imread("test.png",0);
    //encode image and put data into the vector buf
    imencode(".png",frame, buf);
    //send the total size of vector to parent
    cout<<buf.size()<<endl;
    unsigned int written= 0;

    int i = 0;
    size_t toWrite = 0;
    //send until all bytes have been sent
    while (written<buf.size())
    {
        //send the current block of data
        toWrite = BUFLEN < (buf.size()-written) ? BUFLEN : (buf.size()-written);
        written += write(STDOUT_FILENO, buf.data()+written, toWrite);
        i++;
    }
    return 0;

}

The error

The child reads an image, encodes it and sends first the dimensions (size, #rows, #cols) to the parent and then the encoded image data.

The parent reads first the dimensions (no prob with that), then it starts reading data. Data is read 4096 bytes at each iteration. However when less than 4096 bytes are missing, it tries to read only the missing bytes: in my case the last step should read 1027 bytes (115715%4096), but instead of reading all of them it just reads `15.

What I got printed for the last two iterations is:

4096 btytes to copy
1034 bytes read
111626 total bytes read

111626 bytes received over 115715 expected
111626 final bytes read
OpenCV(4.0.0-pre) Error: Assertion failed (size.width>0 && size.height>0) in imshow, file /path/window.cpp, line 356
terminate called after throwing an instance of 'cv::Exception'
  what():  OpenCV(4.0.0-pre) /path/window.cpp:356: error: (-215:Assertion failed) size.width>0 && size.height>0 in function 'imshow'

Aborted (core dumped)

Why isn't read reading all the missing bytes?

I am working on this image:enter image description here

There might be errors also on how I am trying to decode back the image so any help there would be appreciated too.

EDIT

In my opinion as opposed to some suggestions the problem is not related to the presence of \n or \r or \0.

In fact when I print data received as integer with the following lines:

for (int ii=0; ii<val; ii++)
{
    std::cout<<(int)buf[ii]<< " ";
}

I see 0, 10 and 13 values (the ASCII values of the above mentioned characters) in the middle of data so this makes me think it is not the problem.

3条回答
倾城 Initia
2楼-- · 2020-05-03 01:44

You're writing binary data to standard output, which is expecting text. Newline characters (\n) and/or return characters (\r) can be added or removed depending on your systems encoding for end-of-line in text files. Since you're missing characters, it appears that you system is removing one of those two characters.

You need to write your data to a file that you open in binary mode, and you should read in your file in binary.

查看更多
The star\"
3楼-- · 2020-05-03 01:53
fgets(temp, 10, fp);
...
read(fileno(fp), ...)

This cannot possibly work.

stdio routines are buffered. Buffers are controlled by the implementation. fgets(temp, 10, fp); will read an unknown number of bytes from the file and put it in a buffer. These bytes will never be seen by low level file IO again.

You never, ever, use the same file with both styles of IO. Either do everything with stdio, or do everything with low-level IO. The first option is the easiest by far, you just replace read with fread.

If for some ungodly reason known only to the evil forces of darkness you want to keep both styles of IO, you can try that by calling setvbuf(fp, NULL, _IOLBF, 0) before doing anything else. I have never done that and cannot vouch for this method, but they say it should work. I don't see a single reason to use it though.

On a possibly unrelated, note, your reading loop has some logic in its termination condition that is not so easy to understand and could be invalid. The normal way to read a file looks approximately as follows:

 left = data_size;
 total = 0;
 while (left > 0 &&
        (got=read(file, buf+total, min(chunk_size, left))) > 0) {
    left -= got;
    total += got;
 }

 if (got == 0) ... // reached the end of file
 else if (got < 0) ... // encountered an error

The more correct way would be to try again if got < 0 && errno == EINTR, so the modified condition could look like

 while (left > 0 &&
        (((got=read(file, buf+total, min(chunk_size, left))) > 0) ||
        (got < 0 && errno == EINTR))) {

but at this point readability starts to suffer and you may want to split this in separate statements.

查看更多
来,给爷笑一个
4楼-- · 2020-05-03 02:01

Updated Answer

I am not the world's best at C++, but this works and will give you a reasonable starting point.

parent.cpp

#include <stdlib.h>
#include <unistd.h>
#include <iostream>
#include "opencv2/opencv.hpp"


int main(int argc, char *argv[])
{
    // File descriptor to the child process
    FILE *fp;

    // Launch the child process with popen
    if ((fp = popen("./child", "r")) == NULL)
    {
        return 1;
    }

    // Read the number of bytes of encoded image data
    std::size_t filesize;
    fread(&filesize, sizeof(filesize), 1, fp);
    std::cout << "Filesize: " << filesize << std::endl;

    // Allocate memory to store encoded image data that will be received
    std::vector<uint8_t> buffer(filesize);

    int bufferoffset   = 0;
    int bytesremaining = filesize;
    while(bytesremaining>0)
    {
        std::cout << "Attempting to read: " << bytesremaining << std::endl;
        int bytesread   = fread(&buffer[bufferoffset],1,bytesremaining,fp);
        bufferoffset   += bytesread;
        bytesremaining -= bytesread;
        std::cout << "Bytesread/remaining: " << bytesread << "/" << bytesremaining << std::endl;
    }
    pclose(fp);

    // Display that image
    cv::Mat frame;
    frame = cv::imdecode(buffer, -CV_LOAD_IMAGE_ANYDEPTH);
    cv::imshow("win", frame);
    cv::waitKey(0);
}

child.cpp

#include <cstdio>
#include <cstdint>
#include <vector>
#include <fstream>
#include <cassert>
#include <iostream>

int main()
{
    std::FILE* fp = std::fopen("image.png", "rb");
    assert(fp);

    // Seek to end to get filesize
    std::fseek(fp, 0, SEEK_END);
    std::size_t filesize = std::ftell(fp);

    // Rewind to beginning, allocate buffer and slurp entire file
    std::fseek(fp, 0, SEEK_SET);
    std::vector<uint8_t> buffer(filesize);
    std::fread(buffer.data(), sizeof(uint8_t), buffer.size(), fp);
    std::fclose(fp);

    // Write filesize to stdout, followed by PNG image
    std::cout.write((const char*)&filesize,sizeof(filesize));
    std::cout.write((const char*)buffer.data(),filesize);
}

Original Answer

There are a couple of issues:

Your while loop writing the data from the child process is incorrect:

while (written<buf.size())
{
    //send the current block of data
    written += write(STDOUT_FILENO, buf.data()+written, s);
    i++;
}

Imagine your image is 4097 bytes. You will write 4096 bytes the first time through the loop and then try and write 4096 (i.e. s) bytes on the second pass when there's only 1 byte left in your buffer.

You should write whichever is the lesser of 4096 and bytes remaining in buffer.


There's no point sending the width and height of the file, they are already encoded in the PNG file you are sending.

There's no point calling imread() in the child to convert the PNG file from disk into a cv::Mat and then calling imencode() to convert it back into a PNG to send to the parent. Just open() and read the file as binary and send that - it is already a PNG file.


I think you need to be clear in your mind whether you are sending a PNG file or pure pixel data. A PNG file will have:

  • PNG header,
  • image width and height,
  • date of creation,
  • color type, bit-depth
  • compressed, checksummed pixel data

A pixel-data only file will have:

  • RGB, RGB, RGB, RGB
查看更多
登录 后发表回答