我曾经写过一个简单的“爬虫”来用 JAVA 为我下载 http 页面。现在我正在尝试使用 LWP 模块将相同的东西重写为 Perl。
这是我的 Java 代码(工作正常):
String referer = "http://example.com";
String url = "http://example.com/something/cgi-bin/something.cgi";
String params= "a=0&b=1";
HttpState initialState = new HttpState();
HttpClient httpclient = new HttpClient();
httpclient.setState(initialState);
httpclient.getParams().setCookiePolicy(CookiePolicy.NETSCAPE);
PostMethod postMethod = new PostMethod(url);
postMethod.addRequestHeader("Referer", referer);
postMethod.addRequestHeader("User-Agent", " Mozilla/5.0 (Windows; U; Windows NT 6.1; pl; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13");
postMethod.addRequestHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8");
postMethod.addRequestHeader("Content-Type", "application/x-www-form-urlencoded");
String length = String.valueOf(params.length());
postMethod.addRequestHeader("Content-Length", length);
postMethod.setRequestBody(params);
httpclient.executeMethod(postMethod);
这是 Perl 版本:
my $referer = "http://example.com/something/cgi-bin/something.cgi?module=A";
my $url = "http://example.com/something/cgi-bin/something.cgi";
my @headers = (
'User-Agent' => 'Mozilla/5.0 (Windows; U; Windows NT 6.1; pl; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13',
'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Referer' => $referer,
'Content-Type' => 'application/x-www-form-urlencoded',
);
my @params = (
'a' => '0',
'b' => '1',
);
my $browser = LWP::UserAgent->new( );
$browser->cookie_jar({});
$response = $browser->post($url, @params, @headers);
print $response->content;
发布请求正确执行,但我得到另一个(主)网页。好像 cookie 不能正常工作......
任何猜测什么是错的?为什么我从 JAVA 和 perl 程序中得到不同的结果?