PHP and HTTP Header Line Breaks: What character us

Posted 2019-07-29 23:10

I'm looping through each line of a series of HTTP headers returned by cURL, trying to detect when one header block ends and the next begins. I know that an HTTP header block terminates with an empty line, but what character is used to represent this line break in PHP? I've tried with \n but it doesn't seem to work. I certainly could be doing something wrong.

What character is used to represent the line break used to terminate a header?

Here's my existing code:

$output = '';
$redirect = '';
$regs = '';
foreach ($curl_response as $line)
{   
    if ($line != "\n")
    {   # line is not a linebreak, so we're still processing a header block

        if (preg_match("(HTTP/[0-9]\.[0-9] [0-9]{3} .*)",$line))
        {   # line is the status code
            # highlight the outputted line
            $output .= "<b style='background: yellow;'>$line</b>";
        }

        elseif (preg_match("/^Location: (.*)$/m",$line,$regs)) 
        {   # the line is a location header, so grab the location being redirected to
            # highlight the outputted line
            $output .= "<b style='background: purple; color: white;'>$line</b>";
            $redirect = $regs[1];
        }

        else 
        {   # some other header, record to output
            $output .= $line;
        }

    }

    else 
    {   # we've reached a line break, so we're getting to a new block of redirects
        $output .= "\nreached line break\n";
        if ($redirect != '')
        {   # if we recorded a redirect above, append it to output
            $output .= "\n\nRedirecting to $redirect\n\n";
            $redirect = '';
        }

    }   

}

echo $output;

Solved - Turns out that \r is what I should have been matching on. Very odd. Not sure if this changes per site, or if it's something set in cURL. So far it's \r on all sites I've tried.

Edit 2: Doh. I think it's because in order to get the header into an array of lines, I exploded it on \n. So perhaps any \r\n are now just \r...

$c = explode("\n",$content);
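A minimal sketch of what Edit 2 describes, assuming the response uses CRLF line endings: after explode("\n", ...), each line still carries its trailing "\r", so the separator line is "\r" rather than an empty string. Stripping that "\r" before comparing sidesteps the problem. The sample $content string and the $header_lines / $blank_lines names are made up for illustration.

```php
<?php
// Hypothetical sample; a real $content would come from curl_exec().
$content = "HTTP/1.1 301 Moved Permanently\r\nLocation: http://example.com/\r\n\r\n"
         . "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n";

$c = explode("\n", $content);

$header_lines = array();
$blank_lines = 0;
foreach ($c as $line) {
    // explode("\n") leaves the "\r" of every CRLF attached to the line,
    // so strip it before testing for an empty line.
    $line = rtrim($line, "\r");
    if ($line === '') {
        $blank_lines++;   // end of a header block, or the empty piece after the final "\n"
    } else {
        $header_lines[] = $line;
    }
}

echo count($header_lines) . " header lines, $blank_lines blank lines\n";
```

Note that the last element produced by explode() is an empty string when the input ends in "\n", so it is counted as a blank line here too.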

2 Answers
The star\"
#2 · 2019-07-29 23:36

The headers terminate with a double line break with nothing in between (i.e. an empty line). A line break can be "\n", "\r\n", or just "\r". Even though the last is uncommon, it still needs to be accounted for.

Perhaps you could find the end of the headers with a regular expression like

// The atomic group (?>...) keeps a single "\r\n" from being matched as "\r" followed by "\n".
list($headers) = preg_split('/(?>\r\n|\r|\n){2}/', $httpresponse, 2);
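The same idea can probably be extended to a response that contains several header blocks, as happens when cURL follows redirects with header output enabled. The sketch below is only an illustration: split_header_blocks() and the sample $httpresponse are hypothetical, and it assumes every header block starts with an HTTP/1.x-style status line. The atomic group makes a single "\r\n" count as one line break, so only a genuinely empty line splits the text.

```php
<?php
// Hypothetical helper: returns the header blocks found at the start of a raw response.
function split_header_blocks($httpresponse)
{
    // Two consecutive line breaks (CRLF, CR, or LF) mark an empty line.
    $parts = preg_split('/(?>\r\n|\r|\n){2}/', $httpresponse);

    $blocks = array();
    foreach ($parts as $part) {
        // Stop at the first part that does not begin with a status line (the body).
        if (!preg_match('#^HTTP/[0-9]\.[0-9] [0-9]{3}#', $part)) {
            break;
        }
        $blocks[] = $part;
    }
    return $blocks;
}

// Made-up response: one redirect followed by the final page.
$httpresponse = "HTTP/1.1 301 Moved Permanently\r\nLocation: http://example.com/new\r\n\r\n"
              . "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>body</html>";

foreach (split_header_blocks($httpresponse) as $block) {
    if (preg_match('/^Location:\s*(.*?)\s*$/mi', $block, $regs)) {
        echo "Redirecting to {$regs[1]}\n";
    }
}
```

Working on whole blocks rather than single lines means the loop never has to compare a line against a particular line-break character.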
ら.Afraid
#3 · 2019-07-29 23:41

You need to also check for "\r\n" and "\r", as those are also valid terminating empty lines.

When in canonical form, media subtypes of the "text" type use CRLF as the text line break. HTTP relaxes this requirement and allows the transport of text media with plain CR or LF alone representing a line break when it is done consistently for an entire entity-body. HTTP applications MUST accept CRLF, bare CR, and bare LF as being representative of a line break in text media received via HTTP.

-- HTTP/1.1: Protocol Parameters - 3.7.1 Canonicalization and Text Defaults
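One way to accept all three forms when looping line by line is to split on any of them instead of on "\n" alone. This is just a sketch; split_lines() is a hypothetical helper name and the input string is invented to mix the three styles.

```php
<?php
// Hypothetical helper: split text into lines, accepting CRLF, bare CR, or bare LF.
function split_lines($text)
{
    // "\r\n" is listed first so a CRLF pair counts as one line break, not two.
    return preg_split('/\r\n|\r|\n/', $text);
}

$lines = split_lines("A: 1\r\nB: 2\rC: 3\n\nbody");
foreach ($lines as $i => $line) {
    if ($line === '') {
        echo "empty line at index $i\n";
    }
}
```

With the line breaks removed by the split, the terminating line is simply the empty string, whichever style the server used.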
