I am trying to parse some phone numbers from a website.

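I fetch the page source with cURL, but I only get back about half of it, and the missing part is exactly what I need. This has been bugging me for a while.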

My code so far:

$ch = curl_init("http://www.baroul-bucuresti.ro/index.php?w=definitivi&l=C&p=2");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
$content = curl_exec($ch);
curl_close($ch);
print_r ($content);

1 Answer

I think the problem is that the URL in question returns a 302, which redirects to another location:

$ telnet www.baroul-bucuresti.ro 80
Trying 91.208.179.20...
Connected to www.baroul-bucuresti.ro.
Escape character is '^]'.
GET /index.php?w=definitivi&l=C&p=2 HTTP/1.1
host: www.baroul-bucuresti.ro

HTTP/1.1 302 Found
Date: Fri, 27 Apr 2012 20:24:54 GMT
Server: Apache/2.2.15 (CentOS)
X-Powered-By: PHP/5.3.3
Set-Cookie: PHPSESSID=qjbqvveqtmarv7o0f820bbeq71; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: for_tablou=1
Set-Cookie: bvbsessionhash=b9c609e162dab90fc86c1fdb52e07fdd; expires=Sun, 27-May-2012 20:24:57 GMT; path=/
Set-Cookie: bvblastvisit=1335558297; expires=Sun, 27-May-2012 20:24:57 GMT; path=/
Set-Cookie: bvblastactivity=1335558297; expires=Sun, 27-May-2012 20:24:57 GMT; path=/
Set-Cookie: bvbuserid=deleted; expires=Thu, 28-Apr-2011 20:24:56 GMT; path=/
Set-Cookie: for_tablou=1
Location: /tablou
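
The same check can be scripted instead of read by eye. The sketch below is only an illustration: `redirect_target` is a hypothetical helper name, and it assumes you already have the raw response headers as a string (for example, from a cURL request made with `CURLOPT_HEADER` enabled and redirects not followed). It returns the `Location` target when the status line is a 3xx, and `null` otherwise.

```php
<?php
// Hypothetical helper (not part of cURL or the original code): given a raw
// HTTP response header block, return the Location target if the status is
// a 3xx redirect, or null otherwise.
function redirect_target(string $rawHeaders): ?string
{
    // Split on any line-ending style; HTTP uses CRLF, pasted text may not.
    $lines = preg_split('/\r\n|\n|\r/', trim($rawHeaders));

    // The status line looks like "HTTP/1.1 302 Found".
    if (!preg_match('#^HTTP/\S+\s+(\d{3})#', $lines[0], $m)) {
        return null;
    }

    $status = (int) $m[1];
    if ($status < 300 || $status >= 400) {
        return null; // not a redirect
    }

    // Header names are case-insensitive, so compare without regard to case.
    foreach (array_slice($lines, 1) as $line) {
        if (stripos($line, 'Location:') === 0) {
            return trim(substr($line, strlen('Location:')));
        }
    }

    return null; // 3xx without a Location header
}

// A shortened copy of the headers shown in the telnet session above.
$headers = "HTTP/1.1 302 Found\r\n"
         . "Server: Apache/2.2.15 (CentOS)\r\n"
         . "Set-Cookie: for_tablou=1\r\n"
         . "Location: /tablou\r\n";

echo redirect_target($headers), "\n"; // prints /tablou
```

A non-null result means the body you received belongs to the redirect response, not to the page you actually wanted.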

I changed your code by adding this option to cURL:

curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

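It now seems to return the full content. I'm not sure it is the content you want, but it fetches everything from the real location. Can you give it a try?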
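
Putting the pieces together, a minimal sketch could look like the following. The function names `build_curl_options` and `fetch_page` are made up for illustration. The cookie jar is an assumption on my part: the 302 response above sets several cookies, so the redirected page may depend on them, but I have not verified that it does. The redirect limit of 5 is likewise an arbitrary choice.

```php
<?php
// Hypothetical helper: the cURL options used for the request.
function build_curl_options(string $cookieFile): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow the 302 to the real page
        CURLOPT_MAXREDIRS      => 5,           // arbitrary cap to avoid redirect loops
        // Assumption: the target page may need the cookies set by the 302.
        CURLOPT_COOKIEFILE     => $cookieFile, // read cookies from this file
        CURLOPT_COOKIEJAR      => $cookieFile, // write received cookies to this file
    ];
}

// Hypothetical helper: fetch a URL and return the final response body.
function fetch_page(string $url, string $cookieFile): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, build_curl_options($cookieFile));

    $content = curl_exec($ch);
    if ($content === false) {
        // curl_exec returns false on failure; surface the reason.
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }

    return $content;
}

// Only runs when a URL is passed on the command line: php fetch.php <url>
$url = $_SERVER['argv'][1] ?? null;
if (PHP_SAPI === 'cli' && $url !== null) {
    $cookieFile = tempnam(sys_get_temp_dir(), 'curl_cookies_');
    echo fetch_page($url, $cookieFile);
}
```

If the output is still incomplete with redirects followed, comparing it against what a browser receives for the same final URL would be the next thing to check.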

Answered 2012-04-27T20:28:48.650