Parse a subset of HTTP headers to identify the host web address

Posted 2019-09-19 06:24

Question:

The HTTP protocol is all contained in the data portion of TCP packets.

As an assignment, I need to parse HTTP header fields (for the purposes of this question, let's only consider the host web address) using only string parsing functions; I cannot use any existing libraries for this. I tried to find a bitwise segmentation of the HTTP header but failed, and I really do not know what to do now. Any suggestions?

Thank you in advance.

So far I have extracted the Ethernet, IP and TCP header information and parsed the data in hexadecimal form, i.e.

Data

    48 54 54 50 2F 31 2E 31 20 33 30 34 20 4E 6F 74         HTTP/1.1 304 Not
    20 4D 6F 64 69 66 69 65 64 0D 0A 58 2D 43 6F 6E          Modified..X-Con
    74 65 6E 74 2D 54 79 70 65 2D 4F 70 74 69 6F 6E         tent-Type-Option
    73 3A 20 6E 6F 73 6E 69 66 66 0D 0A 44 61 74 65         s: nosniff..Date
    3A 20 54 68 75 2C 20 30 31 20 44 65 63 20 32 30         : Thu, 01 Dec 20
    31 31 20 31 33 3A 31 36 3A 34 30 20 47 4D 54 0D         11 13:16:40 GMT.
    0A 53 65 72 76 65 72 3A 20 73 66 66 65 0D 0A 58         .Server: sffe..X
    2D 58 53 53 2D 50 72 6F 74 65 63 74 69 6F 6E 3A         -XSS-Protection:
    20 31 3B 20 6D 6F 64 65 3D 62 6C 6F 63 6B 0D 0A          1; mode=block..
    0D 0A                                                   ..

MeNa has shown the trick for separating HTTP header fields. But to do that I need to convert my data payload to a string. I tried it the following way:

    unsigned char * data;
    data = (unsigned char *)(packet + ETHERNET_HEADER_SIZE + IP_HEADER_SIZE + TCP_HEADER_SIZE);
    int length = header_length - (ETHERNET_HEADER_SIZE + IP_HEADER_SIZE + TCP_HEADER_SIZE);

    char string[length];
    for (i = 0; i < length; i++) {
        string[i] = (char)data[i];
    }
    printf("%s ", string);

This prints out a string, but it consists mostly of little squares rather than characters :(
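Judging from the snippet alone, one likely cause of the squares is that `string` is never NUL-terminated, so `printf("%s")` keeps reading past the end of the array. Two other possibilities: a fixed `TCP_HEADER_SIZE` misplaces the start of the payload whenever the TCP header carries options, and payloads that are not HTTP text (TLS, compressed bodies) will look like garbage in any case. The sketch below addresses only the first point; `payload_to_string` is a hypothetical helper name, not something from the original code.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: copy `length` payload bytes into a freshly allocated,
 * NUL-terminated string. Bytes that are neither printable nor CR/LF/TAB are
 * replaced with '.', much like the ASCII column of a hex dump.
 * The caller is responsible for calling free() on the result. */
char *payload_to_string(const unsigned char *data, size_t length) {
    char *s = malloc(length + 1);          /* +1 for the terminating NUL */
    if (s == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < length; i++) {
        unsigned char c = data[i];
        if (isprint(c) || c == '\r' || c == '\n' || c == '\t') {
            s[i] = (char)c;
        } else {
            s[i] = '.';
        }
    }
    s[length] = '\0';                      /* the step missing in the snippet */
    return s;
}
```

With the variables from the snippet, this could be used as `char *text = payload_to_string(data, (size_t)length);` followed by `printf("%s\n", text);` and `free(text);`. If only printing is needed, `printf("%.*s", length, (const char *)data)` limits the output to `length` bytes without making a copy.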

Answer 1:

I don't think there is any bitwise segmentation of the HTTP header. HTTP is a text protocol in which each header line ends with "\r\n", so all you need to do is look for the next "\r\n" and pick out the line before it.

Something like:

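    #include <stdio.h>
    #include <string.h>
    int main(void) {
        /* strtok writes into its input, so the request must be a writable array */
        char httpRe[] = "GET / HTTP/1.1\r\nHost: stackoverflow.com\r\nReferer: https://www.google.com/\r\n\r\n";
        char *parser = strtok(httpRe, "\r\n");   /* first line: the request line */
        while (parser != NULL) {
            printf("%s\n", parser);              /* one header line per iteration */
            parser = strtok(NULL, "\r\n");
        }
        return 0;
    }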

Is this what you are looking for?
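To go from splitting lines to actually pulling out the host, something along these lines might work. It is only a sketch: `find_host` and `starts_with_nocase` are hypothetical helper names, the input is assumed to be a NUL-terminated request, and `strstr` is used instead of `strtok` so that the blank line ending the headers can be detected and the input is left unmodified. Keep in mind that `Host` is a request header, so a response such as the `304 Not Modified` in the question will not contain one.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: case-insensitive test that `line` starts with `name`.
 * Header field names are case-insensitive in HTTP. */
static int starts_with_nocase(const char *line, const char *name) {
    while (*name != '\0') {
        if (tolower((unsigned char)*line) != tolower((unsigned char)*name)) {
            return 0;   /* also covers `line` ending early: '\0' never matches */
        }
        line++;
        name++;
    }
    return 1;
}

/* Hypothetical helper: find the Host header in a NUL-terminated HTTP request
 * and copy its value into `out` (at most out_size - 1 characters).
 * Returns 1 if a Host header was found, 0 otherwise. */
int find_host(const char *request, char *out, size_t out_size) {
    if (out_size == 0) {
        return 0;
    }
    const char *line = strstr(request, "\r\n");    /* end of the request line */
    while (line != NULL) {
        line += 2;                                  /* step over "\r\n" */
        const char *end = strstr(line, "\r\n");
        if (end == NULL || end == line) {
            return 0;   /* truncated message, or the blank line ending the headers */
        }
        if (starts_with_nocase(line, "Host:")) {
            const char *value = line + 5;           /* just past "Host:" */
            while (value < end && (*value == ' ' || *value == '\t')) {
                value++;                            /* skip leading whitespace */
            }
            size_t n = (size_t)(end - value);
            while (n > 0 && (value[n - 1] == ' ' || value[n - 1] == '\t')) {
                n--;                                /* trim trailing whitespace */
            }
            if (n >= out_size) {
                n = out_size - 1;                   /* truncate to fit the buffer */
            }
            memcpy(out, value, n);
            out[n] = '\0';
            return 1;
        }
        line = end;                                 /* move on to the next header */
    }
    return 0;
}
```

A call could look like `char host[256]; if (find_host(text, host, sizeof host)) printf("%s\n", host);`, where `text` is the payload already converted to a NUL-terminated string.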