
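I am trying to use curl to request some URLs that contain non-Latin characters, but I get no response. My browser opens the same URLs without any trouble. I checked the string conversion, and it seems I am requesting "http://www.linkedin.com/pub/j-rgen-a-tr-ff/7/606/68a" while my browser requests "http://se.linkedin.com/pub/j%C3%B6rgen-a-tr%C3%A4ff/7/606/68a". How can I convert the string so that the curl request succeeds?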

function hitFormGet($loginURL, $loginFields, $referer,$cookieString)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_COOKIEJAR, "cookies.txt");
    curl_setopt($ch, CURLOPT_COOKIEFILE, "cookies.txt");

    //curl_setopt($ch,    CURLOPT_AUTOREFERER,         true);
    curl_setopt($ch,    CURLOPT_COOKIESESSION,         true);
    //curl_setopt( $ch, CURLOPT_COOKIE,$cookieString);
    curl_setopt($ch,    CURLOPT_FAILONERROR,         false);
    curl_setopt($ch,    CURLOPT_FOLLOWLOCATION,        false);
    curl_setopt($ch, CURLOPT_VERBOSE, 1 );
    curl_setopt($ch, CURLOPT_ENCODING, 'gzip,deflate,sdch');
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($ch,    CURLOPT_FRESH_CONNECT,         true);
    curl_setopt($ch,    CURLOPT_HEADER,             false);
    //curl_setopt($ch,    CURLOPT_POST,                 true);
    curl_setopt($ch,    CURLOPT_RETURNTRANSFER,        true);
    curl_setopt($ch,    CURLOPT_CONNECTTIMEOUT,     30);
    curl_setopt($ch,    CURLOPT_USERAGENT, "Googlebot/2.1 (+http://www.googlebot.com/bot.html)");

    curl_setopt($ch, CURLOPT_URL, $loginURL.$loginFields);

    curl_setopt($ch, CURLOPT_REFERER, $referer);

    //curl_setopt($ch, CURLOPT_POSTFIELDS, $loginFields);
    $ret = curl_exec($ch);
    curl_close($ch);
    return $ret;
}


$res=hitFormGet("http://se.linkedin.com/pub/j%C3%B6rgen-a-tr%C3%A4ff/7/606/68a","","","");

1 Answer


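It looks like you are accessing LinkedIn from Sweden, which is why you are redirected to se.linkedin.com. To convert the URL as expected, apply urlencode() to the dynamic part of the URL, that is, the j-rgen-a-tr-ff/7/606/68a portion in your example.

That should work.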
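As a minimal sketch of that idea, the snippet below percent-encodes each path segment separately so the slashes survive. Note the assumptions: `encodePathSegments` is a hypothetical helper name, the input is taken to be raw UTF-8 that has not been percent-encoded yet (already-encoded input would be encoded twice), and `rawurlencode()` is swapped in for `urlencode()` because it is the variant meant for URL paths (a space becomes `%20` rather than `+`).

```php
<?php
// Hypothetical helper: percent-encode every segment of a URL path.
// Assumes $path is raw UTF-8 and NOT already percent-encoded.
function encodePathSegments(string $path): string
{
    // Split on "/", encode each piece, then join again so the
    // slashes themselves are left untouched.
    return implode('/', array_map('rawurlencode', explode('/', $path)));
}

// "\u{F6}" is ö and "\u{E4}" is ä; PHP 7+ emits them as UTF-8 bytes.
$path = "/pub/j\u{F6}rgen-a-tr\u{E4}ff/7/606/68a";
$url  = 'http://se.linkedin.com' . encodePathSegments($path);

echo $url, "\n";
// prints http://se.linkedin.com/pub/j%C3%B6rgen-a-tr%C3%A4ff/7/606/68a
```

The resulting string can then be passed as `$loginURL` to `hitFormGet()`. If the name reaches your script as `j-rgen-a-tr-ff`, the non-ASCII characters were already lost earlier, and no encoding step can bring them back.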

Answered 2012-04-26T08:33:14.750