My urls array contains about 100 URLs, and I am fetching information from an API designed for this kind of request. However, roughly 50% of the URLs return:

400 Bad Request


Your browser sent a request that this server could not understand.
GET /player/euw/Wolves Deficio/ingame HTTP/1.1
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1
Host: api.captainteemo.com
Accept: */*

The URLs are 100% correct, because when I paste them into a browser I do get the information.

Here is my code:

function get_data($urls) {
    // spoofing FireFox 2.0
    $useragent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1";

    $ch = curl_init();

    // set user agent
    curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
    if(is_array($urls)) {
        $output = array();
        foreach($urls as $url) {
            // set the rest of your cURL options here
            curl_setopt($ch, CURLOPT_URL, $url); 
            //return the transfer as a string 
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
            // $output contains the output string 
            array_push($output, curl_exec($ch));
        }
    }
    else {
        // set the rest of your cURL options here
        curl_setopt($ch, CURLOPT_URL, $urls); 
        //return the transfer as a string 
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
        // $output contains the output string 
        $output = curl_exec($ch);

    }
    // close curl resource to free up system resources 
    curl_close($ch);    

    return $output;
}

2 Answers

I added CURLOPT_FOLLOWLOCATION in case the server responds with a redirect:

curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

Try this:

<?php

function get_data($urls) 
{
    if ( !is_array( $urls ) ) 
        $urls = array( '0' => $urls );

    $useragent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1";

    $ch = curl_init();

    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

    $output = array();

    foreach($urls as $url)
    {
        curl_setopt($ch, CURLOPT_URL, $url); 
        $output[$url] = curl_exec($ch);
    }

    curl_close($ch);

    return $output;
}

$urls = array(
                'http://www.wikipedia.org/',
                'http://www.stackoverflow.com'
             );

//or single url
//$urls = 'http://www.stackoverflow.com';

print_r ( get_data($urls) );
answered 2013-04-11T16:38:34.350

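The problem was that I was spoofing the user agent instead of sending a real one, and that I was not replacing the spaces (" ") in the URLs with "+". A browser encodes a space automatically, but cURL sends the URL as given, so a request line such as GET /player/euw/Wolves Deficio/ingame HTTP/1.1 contains a raw space and the server rejects it with 400 Bad Request.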
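
As a rough illustration of the encoding fix, here is a minimal sketch that percent-encodes each path segment before the URL is handed to cURL. The helper name encode_url_path is hypothetical, and it uses rawurlencode (which turns a space into %20) rather than the "+" replacement described above; whether this particular API accepts %20 as well as "+" is an assumption and should be checked against the API itself.

```php
<?php

// Hypothetical helper (not part of the original code): percent-encode every
// path segment of a URL so that characters such as spaces are safe to send.
function encode_url_path(string $url): string
{
    $parts = parse_url($url);

    // Leave anything we cannot parse as an absolute URL untouched.
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url;
    }

    $segments = explode('/', isset($parts['path']) ? $parts['path'] : '');

    // Decode first so that segments that are already encoded are not
    // encoded a second time, then encode each segment on its own.
    $encoded = array_map(function ($segment) {
        return rawurlencode(rawurldecode($segment));
    }, $segments);

    $result = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $result .= ':' . $parts['port'];
    }
    $result .= implode('/', $encoded);
    if (isset($parts['query'])) {
        $result .= '?' . $parts['query'];
    }

    return $result;
}

// The URL from the question, with the raw space in the player name.
echo encode_url_path('http://api.captainteemo.com/player/euw/Wolves Deficio/ingame'), "\n";
// prints http://api.captainteemo.com/player/euw/Wolves%20Deficio/ingame
```

In get_data() this would be applied to each $url right before the curl_setopt($ch, CURLOPT_URL, ...) call. If the API only accepts "+", str_replace(' ', '+', $url) is the direct equivalent of the fix described above.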

answered 2013-04-12T10:40:19.173