Download file, winsock recv() to fstream write, fi

Posted 2019-09-04 11:49

I'm trying to download a file from my website using Winsock. I ran into countless problems, and now I'm able to download the file, but it's corrupted.

It doesn't work with any file extension. Text and pictures end up corrupted, audio files too. With binary files I get this error on execution: "program too big to fit in memory".

First I send() a HEAD request to the server to learn the Content-Length (the size of the file to download), then I send a GET request and recv() into a buffer. After recv() is done I write the file.

I tried to write a simple code example here. I tried various loop approaches, but in the end I still get a corrupted file written to disk. The size is the same (50 KB file on the server, 50 KB file downloaded and written to disk). Thank you all.

headrequest = "HEAD " + "/folder/file.asd" + " HTTP/1.1\r\nHost: " + "url.com" + "\r\n\r\n";
getrequest = "GET " + "/folder/file.asd" + " HTTP/1.1\r\nHost: " + "url.com" + "\r\n\r\n";

send(socket, headrequest, sizeof(headrequest), 0);
recv(socket, reply_buf_headrequest, sizeof(reply_buf_headrequest), 0); 
// two functions to get the header end and the "Content-Length" value from the header

send(socket, getrequest, sizeof(getrequest), 0);
while(1)
{    
 recv(socket, recvbuff, sizeof(recvbuff), 0);
 if (recv(socket, recvbuff, sizeof(recvbuff), 0) == 0) 
  break; 
}
out.write(recvbuff, content_lenght); // also tried --> out.write(recvbuff + header_end, content_lenght) //same errors.
out.close();

I'm messing up the buffer position where reading/writing should start, or something like that. I thought using recvbuff + header_end would work, since it would start reading at the end of the header to get the file. This is confusing. I hope some kind soul can help me figure out how to handle this situation and write the file bytes correctly. :)

Edit:

I didn't realize I was overwriting data like that. Damn. content_length comes from the previous HEAD request: a function reads the recv'ed data and finds the "Content-Length" value, which is the size in bytes of /folder/file.asd. I couldn't manage to get it from the GET request, so I did it like this. The file size it gets is correct.

so,

while(1)
{
  if (recv(socket, recvbuff, sizeof(recvbuff), 0) == 0)
   break;
}
out.write(recvbuff, content_lenght);
out.close();

Should out.write go after the loop or inside the while(1) loop?

Thanks for the fast reply. :)

I omitted the error checking to keep the example code short, sorry. The HEAD and GET requests are char buffers; I tried with strings too and ended up not using sizeof() for that. I can't access the real code until tomorrow, so I'm trying to fix it at home using a similar snippet. There are probably some typos.

Edit 2: as a test, with a small exe that just spawns a message box, I'm using a buffer bigger than the file and this:

ofstream out("test.exe", ios::binary);

and using this loop now:

int res;   // return code to monitor transfer
do {    
    res = recv(socket, recvbuff, sizeof(recvbuff), 0);   // look at return code
    if (res > 0)  // if bytes received 
        out.write(recvbuff, res ); // write them  
} while (res>0);   // loop as long as we receive something  
if (res==SOCKET_ERROR)  
    cerr << "Error: " << WSAGetLastError() << endl; 

I'm still getting the "program too big to fit in memory" error on execution.

1 answer
Rolldiameter
#2 · 2019-09-04 12:19

That's normal! Your code doesn't really take care of the content you receive!

See my comments:

while(1)  // Your original (indented) code commented: 
{    
    recv(socket, recvbuff, sizeof(recvbuff), 0);  // You read data in buffer 
    if (recv(socket, recvbuff, sizeof(recvbuff), 0) == 0)  // you read again, overwriting data you've received !! 
        break; 
}
out.write(recvbuff, content_lenght); // You only write the last thing you've received. 
                            // Where does the length come from? You may have a buffer overflow as well.

Rewrite your loop as follows:

int res;   // return code to monitor transfer
do {    
    res = recv(socket, recvbuff, sizeof(recvbuff), 0);   // look at return code
    if (res > 0)  // if bytes received 
        out.write(recvbuff, res ); // write them  
} while (res>0);   // loop as long as we receive something  
if (res==SOCKET_ERROR)  
    cerr << "Error: " << WSAGetLastError() << endl; 

The advantage is that you don't have to care about the overall size, as you write each small chunk as you receive it.

Edit:

Following our exchange of comments, here is some additional information. As someone pointed out, the HTTP protocol is somewhat more complex to manage. See here, in chapter 6, for additional details about the format of a response and the header you have to skip.
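A related point, offered as a hedged sketch rather than as part of the original answer: a loop that waits for recv() to return 0 only ends when the server closes the connection, and HTTP/1.1 connections are persistent by default. One option is to read Content-Length from the GET response header itself and stop once that many body bytes have arrived; another is to send a "Connection: close" request header. The helper name find_content_length is hypothetical, and the sketch assumes the complete header block is already held in a std::string. It does not handle chunked transfer encoding or overflow of very large values.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Winsock or standard-library function).
// Returns true and stores the value in `length` if the header block
// contains a Content-Length field with a decimal value.
bool find_content_length(const std::string& header, std::size_t& length) {
    // Field names are case-insensitive, so search a lower-cased copy.
    std::string lower(header);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    // Anchor on the preceding CRLF so only a field at line start matches.
    const std::string key = "\r\ncontent-length:";
    const std::size_t pos = lower.find(key);
    if (pos == std::string::npos)
        return false;

    std::size_t i = pos + key.size();
    while (i < lower.size() && (lower[i] == ' ' || lower[i] == '\t'))
        ++i;                                   // skip optional whitespace
    if (i == lower.size() || !std::isdigit(static_cast<unsigned char>(lower[i])))
        return false;                          // no digits after the colon

    std::size_t value = 0;
    while (i < lower.size() && std::isdigit(static_cast<unsigned char>(lower[i]))) {
        value = value * 10 + static_cast<std::size_t>(lower[i] - '0');
        ++i;
    }
    length = value;
    return true;
}
```

With the length known, the receive loop can count the body bytes it has written and leave the loop when the count reaches that value, instead of waiting for the connection to close.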

Here is an updated proof of concept that skips the header:

// Needs <algorithm> (std::search), <fstream> and <iostream>; assumes using namespace std.
ofstream out;
out.open(filename, ios::binary);
bool header_skipped = false;  // was the header skipped? (do it only once!)
int res;   // return code to monitor transfer
do {
    res = recv(mysocket, recvbuff, sizeof(recvbuff), 0);   // look at return code
    if (res > 0)     // if bytes received
    {
        size_t data_offset = 0;      // normally take data from the beginning of the buffer
        if (!header_skipped) {    // if header was not skipped yet, look for its end
            const char *eoh = "\r\n\r\n";   // string literals are const
            auto it = search(recvbuff, recvbuff + res, eoh, eoh + 4);
            if (it != recvbuff + res) {   // if header end found:
                data_offset = it - recvbuff + 4;      // skip it
                header_skipped = true;              // and then do not care any longer,
            }                             // because the data can also contain \r\n\r\n
        }
        if (header_skipped)   // chunks that hold only header bytes are not written
            out.write(recvbuff + data_offset, res - data_offset); // write, ignoring what precedes the offset
    }
} while (res > 0);   // loop as long as we receive something
if (res == SOCKET_ERROR) cerr << "Error: " << WSAGetLastError() << endl;
out.close();

Attention! As said, this is a proof of concept. It will probably work. However, be aware that you cannot be sure how the data will be grouped on the receiving side. It is perfectly possible that the end of the header is split between two successive reads (e.g. \r as the last byte of one recv() and \n\r\n as the first bytes of the next recv()). In that case this simple code won't find it, so it's not yet production-quality code. It's up to you to improve it further.
