HTTP protocol: end of a message body

Posted 2020-06-27 01:47

Question:

I built a program that parses the header and I would like to read the message body in case I receive a POST.

For headers, I have been able to look for the blank line to determine when the header ends. I am having more trouble with the message body. Am I supposed to look at the "Content-Length" field to know when to stop reading input? My current code (below) does not stop until I hit the red cross (stop loading page) in Firefox.

Here is the code:

size_t n;
unsigned char newChar;

int index = 0;
int capacity = 50;
char *option = (char *) malloc(sizeof(char) * capacity); 

while ( ( n = read( req->socket, &newChar, sizeof(newChar) ) ) > 0 ) {
  if (newChar == '\0' || newChar == '\n') break; // This is not working

  if (index == capacity) {
    capacity *= 2;
    option = (char *) realloc(option, sizeof(char) * capacity);
    assert(option != NULL);
  }
  option[index++] = newChar;
  fprintf(stderr, "%c", newChar);
}

if (index == capacity) {
  capacity *= 2;
  option = (char *) realloc(option, sizeof(char) * capacity);
  assert(option != NULL);
}
option[index] = '\0';

The correct input gets printed, but I wonder why it won't stop until the stop loading button is pressed. I'd like to know whether there is any other solution or whether I need to use the "Content-Length" field in the header.

Thank you very much,

Jary

Answer 1:

There are a few cases to consider, and you'll probably want to handle all of them:

  • In HTTP 1.0, closing the connection was used to signal the end of the data.

  • This was improved in HTTP 1.1, which supports persistent connections. With HTTP 1.1 you typically set or read the Content-Length header to know how much data to expect.

  • Finally, HTTP 1.1 also allows "chunked" mode: each chunk is preceded by its size, and you know you have reached the end when you find a chunk of size 0.
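To make the three cases concrete, here is a rough sketch of how the header block could be inspected to decide which one applies. The names find_header, detect_framing and the body_framing enum are made up for illustration, and the sketch assumes the headers have already been read into a NUL-terminated string whose lines end in "\r\n". It does not handle every corner of the spec (for example, multiple transfer codings).

```c
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* How the end of the body is signalled (the three cases listed above). */
enum body_framing { BODY_UNTIL_CLOSE, BODY_CONTENT_LENGTH, BODY_CHUNKED };

/* Hypothetical helper: return a pointer to the value of header `name`
 * inside a NUL-terminated header block, or NULL if it is absent.
 * Header names are compared case-insensitively. */
static const char *find_header(const char *headers, const char *name)
{
    size_t name_len = strlen(name);
    const char *line = headers;

    while (line != NULL && *line != '\0') {
        if (strncasecmp(line, name, name_len) == 0 && line[name_len] == ':') {
            const char *value = line + name_len + 1;
            while (*value == ' ' || *value == '\t')
                value++;                 /* skip optional whitespace */
            return value;
        }
        line = strstr(line, "\r\n");     /* advance to the next line */
        if (line != NULL)
            line += 2;
    }
    return NULL;
}

/* Decide how the body ends. On BODY_CONTENT_LENGTH, *length receives
 * the announced number of body bytes. */
static enum body_framing detect_framing(const char *headers, size_t *length)
{
    const char *value = find_header(headers, "Transfer-Encoding");
    if (value != NULL && strncasecmp(value, "chunked", 7) == 0)
        return BODY_CHUNKED;

    value = find_header(headers, "Content-Length");
    if (value != NULL) {
        *length = (size_t)strtoull(value, NULL, 10);
        return BODY_CONTENT_LENGTH;
    }
    /* Neither header: HTTP 1.0 style, data ends when the connection closes. */
    return BODY_UNTIL_CLOSE;
}
```

Transfer-Encoding is checked first on purpose: when both headers are present, the chunked encoding is the one that determines where the body ends.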

Also, do you know about libcurl? It will certainly save you from having to reinvent the wheel.



Answer 2:

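This code blocks in read(), waiting for another character that never comes. A POST body is not terminated by '\n' or '\0', and the browser keeps the connection open while it waits for the response, so read() neither sees a terminator nor returns 0.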
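One way to avoid the blocking call is to stop asking for data once the announced number of bytes has arrived. The following is only a sketch: read_body is a made-up name, and it assumes the body length has already been taken from the Content-Length header and that the caller's buffer holds at least that many bytes.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly `length` body bytes from `fd` into `buf`.
 * Returns the number of bytes stored; a result smaller than `length`
 * means the peer closed the connection early or an error occurred. */
static size_t read_body(int fd, char *buf, size_t length)
{
    size_t total = 0;

    while (total < length) {
        /* read() may return fewer bytes than requested, so keep looping */
        ssize_t n = read(fd, buf + total, length - total);
        if (n > 0) {
            total += (size_t)n;
        } else if (n == 0) {
            break;          /* peer closed the connection */
        } else if (errno != EINTR) {
            break;          /* real error; the caller can inspect errno */
        }
        /* on EINTR, simply retry */
    }
    return total;
}
```

Because the loop never requests more than `length` bytes in total, it returns as soon as the body is complete instead of waiting for the browser to close the connection. Note that the result of read() is kept in a signed ssize_t so that -1 can be told apart from a byte count.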

Additionally, RFC 2616, section 3.7.1, states: "HTTP applications MUST accept CRLF, bare CR, and bare LF as being representative of a line break in text media received via HTTP. In addition, if the text is represented in a character set that does not use octets 13 and 10 for CR and LF respectively, as is the case for some multi-byte character sets, HTTP allows the use of whatever octet sequences are defined by that character set to represent the equivalent of CR and LF for line breaks."

So you're going to need to catch more than just "\n".
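As an illustration, a check that accepts all three forms could look roughly like this. line_break_len is a hypothetical helper, and it only covers character sets that use octets 13 and 10 for CR and LF.

```c
#include <stddef.h>

/* Length of the line break at the start of `p`:
 * 2 for CRLF, 1 for a bare CR or a bare LF, 0 if `p` does not
 * start with a line break. `remaining` is the number of valid bytes. */
static size_t line_break_len(const char *p, size_t remaining)
{
    if (remaining >= 2 && p[0] == '\r' && p[1] == '\n')
        return 2;                       /* CRLF */
    if (remaining >= 1 && (p[0] == '\r' || p[0] == '\n'))
        return 1;                       /* bare CR or bare LF */
    return 0;
}
```

CRLF is tested first so that the pair is consumed as a single line break rather than as a bare CR followed by a bare LF.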



Tags: c http