I need everything from the doctype declaration through to the end. This is what I have:

HTTP/1.1 200 OK
Server: nginx
Date: Wed, 19 Sep 2012 07:52:41 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Keep-Alive: timeout=20
Status: 200 OK
X-Runtime: 736
ETag: "66644f063945c4d3f6e5471723306c2c"
Cache-Control: no-cache
Set-Cookie: vrid=e93aae30-e45c-012f-9786-001f29cc11ee; domain=.yellowpages.com; path=/; expires=Tue, 19-Sep-2017 07:52:40 GMT
Set-Cookie: parity_analytics=---+%0A%3Avisit_id%3A+u1ncewtmt44s23myff9s5t1tbcd5h%0A%3Avisit_start_time%3A+2012-09-19+07%3A52%3A40.711499+%2B00%3A00%0A%3Alast_page_load%3A+2012-09-19+07%3A52%3A40.711501+%2B00%3A00%0A; path=/; expires=Sat, 19-Sep-2037 07:52:40 GMT
Set-Cookie: _parity_session=BAh7CDoPc2Vzc2lvbl9pZCIlMDIxNjZiMDVkZmMxNWFmMzQ5OGVlNTk3Njg0MTM2NmY6EF9jc3JmX3Rva2VuSSIxbVhHMGNmM1U1K3E1OFo2NTQwVHltTFdZaHREa1lMMnRCVnE1eVFJNFpHQT0GOgZFRjoTZGV4X3Nlc3Npb25faWRJIillOTg5ZjNlMC1lNDVjLTAxMmYtOTc4Yy0wMDFmMjljYzExZWUGOwdG--08b9db1ba698882287f47a60e34c0c1e227d440a; path=/; HttpOnly
X-Rid: vendetta-ac8a22f2-3a5c-4da8-a1e8-25d4a550bf32
Expires: Wed, 19 Sep 2012 07:52:40 GMT

<!DOCTYPE html><head></head><body></body>...

Can this be done with a regular expression?

Any help is appreciated.

1 Answer

You can use http_parse_headers.

Here is an example:

$headers = substr($yourString, 0, strpos($yourString, '<!DOCTYPE'));
print_r(http_parse_headers($headers));
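Since the question asks for the part from the doctype onward, here is a minimal sketch of splitting the raw response into headers and body. The helper name `split_http_response` is hypothetical and not part of PHP or PECL. It assumes the string holds a single response whose header block ends at the first blank line, and it falls back to searching for `<!DOCTYPE` if no blank line is found.

```php
<?php
// Hypothetical helper (not from the original answer): splits a raw HTTP
// response string into array($headers, $body).
function split_http_response($raw)
{
    // Prefer the blank line that ends the header block; accept CRLF or bare LF.
    if (preg_match('/\r?\n\r?\n/', $raw, $m, PREG_OFFSET_CAPTURE)) {
        $pos = $m[0][1];          // byte offset of the separator
        $len = strlen($m[0][0]);  // length of the separator itself
        return array(substr($raw, 0, $pos), substr($raw, $pos + $len));
    }
    // Fall back to the doctype marker (case-insensitive).
    $pos = stripos($raw, '<!DOCTYPE');
    if ($pos !== false) {
        return array(rtrim(substr($raw, 0, $pos)), substr($raw, $pos));
    }
    // No separator found: treat everything as headers.
    return array($raw, '');
}

// Shortened sample modeled on the response in the question.
$yourString = "HTTP/1.1 200 OK\r\nServer: nginx\r\n"
    . "Content-Type: text/html; charset=utf-8\r\n\r\n"
    . "<!DOCTYPE html><head></head><body></body>";

list($headers, $body) = split_http_response($yourString);
echo $body, "\n";
```

The `$headers` part can then be passed to http_parse_headers as shown above, while `$body` holds the markup from the doctype to the end.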

Here is a useful function from the documentation that uses regular expressions (for when you don't have access to the PECL library):

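function http_parse_headers( $header )
{
    $retVal = array();
    // Unfold continuation lines (CRLF followed by whitespace), then split on CRLF.
    $fields = explode("\r\n", preg_replace('/\x0D\x0A[\x09\x20]+/', ' ', $header));
    foreach( $fields as $field ) {
        if( preg_match('/([^:]+): (.+)/m', $field, $match) ) {
            // Normalize the name, e.g. "content-type" becomes "Content-Type".
            // preg_replace_callback replaces the /e modifier, which was removed in PHP 7.
            $name = preg_replace_callback('/(?<=^|[\x09\x20\x2D])./', function( $m ) {
                return strtoupper($m[0]);
            }, strtolower(trim($match[1])));
            $value = trim($match[2]);
            if( isset($retVal[$name]) ) {
                // Repeated header (e.g. Set-Cookie): collect all values in one flat array.
                $retVal[$name] = array_merge((array) $retVal[$name], array($value));
            } else {
                $retVal[$name] = $value;
            }
        }
    }
    return $retVal;
}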
answered 2012-09-19T08:09:18.747