
This is my current code:

    $SQL = mysql_query("SELECT url FROM urls") or die(mysql_error()); // Query the urls table
    while ($resultSet = mysql_fetch_array($SQL)) { // Put all the urls into one variable

        // Now for some cURL to run it.
        $ch = curl_init($resultSet['url']); // Load the urls
        curl_setopt($ch, CURLOPT_TIMEOUT, 2); // No need to wait for it to load. Execute it and go.
        curl_exec($ch); // Execute
        curl_close($ch); // Close it off
    } // While loop

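I'm relatively new to cURL. By relatively new, I mean this is the first time I've ever used it. At the moment it loads one URL for two seconds, then the next for two seconds, then the next. However, I want it to load all of them at the same time. I'm sure it's possible, I'm just not sure how. If someone could point me in the right direction, I'd appreciate it.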


1 Answer


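You set up each cURL handle the same way, then add them to a curl_multi_ handle. The functions to look at are the curl_multi_* functions documented in the PHP manual. In my experience, though, there were problems when trying to load too many URLs at once (although I can't find my notes on it at the moment), so the last time I used curl_multi_, I set it up to run batches of 5 URLs at a time.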

EDIT: Here is a simplified version of the code I used with curl_multi_:

EDIT: Rewritten slightly, with a lot of added comments that will hopefully help.

// -- create all the individual cURL handles and set their options
// ($urls is assumed to be an array of URL strings, e.g. collected from the database query)
$curl_handles = array();
foreach ($urls as $url) {
    $curl_handles[$url] = curl_init();
    curl_setopt($curl_handles[$url], CURLOPT_URL, $url);
    // set other curl options here
}

// -- start going through the cURL handles and running them
$curl_multi_handle = curl_multi_init();

define('BLOCK_SIZE', 5); // how many URLs to run simultaneously (batches of 5, as mentioned above)

$i = 0; // count where we are in the list so we can break up the runs into smaller blocks
$block = array(); // to accumulate the curl_handles for each group we'll run simultaneously

foreach ($curl_handles as $a_curl_handle) {
    $i++; // increment the position-counter

    // add the handle to the curl_multi_handle and to our tracking "block"
    curl_multi_add_handle($curl_multi_handle, $a_curl_handle);
    $block[] = $a_curl_handle;

    // -- check to see if we've got a "full block" to run or if we're at the end of our list of handles
    if (($i % BLOCK_SIZE == 0) or ($i == count($curl_handles))) {
        // -- run the block

        $running = NULL;
        do {
            // track the previous loop's number of handles still running so we can tell if it changes
            $running_before = $running;

            // run the block or check on the running block and get the number of sites still running in $running
            curl_multi_exec($curl_multi_handle, $running);

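            // if the number of sites still running changed, print out a message with the number of sites that are still running.
            if ($running != $running_before) {
                echo("Waiting for $running sites to finish...\n");
            }

            // wait for activity on any connection instead of spinning the CPU in a tight loop
            if ($running > 0) {
                curl_multi_select($curl_multi_handle, 1.0);
            }
        } while ($running > 0);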

        // -- once the number still running is 0, curl_multi_ is done, so check the results
        foreach ($block as $handle) {
            // HTTP response code
            $code = curl_getinfo($handle,  CURLINFO_HTTP_CODE);

            // cURL error number
            $curl_errno = curl_errno($handle);

            // cURL error message
            $curl_error = curl_error($handle);

            // output if there was an error
            if ($curl_error) {
                echo("    *** cURL error: ($curl_errno) $curl_error\n");
            }

            // remove the (used) handle from the curl_multi_handle
            curl_multi_remove_handle($curl_multi_handle, $handle);
        }

        // reset the block to empty, since we've run its curl_handles
        $block = array();
    }
}

// close the curl_multi_handle once we're done
curl_multi_close($curl_multi_handle);

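Given that you don't need anything back from the URLs, you probably don't need a lot of what's there, but this is how I chunked the requests into blocks of BLOCK_SIZE, waited for each block to finish before moving on, and caught errors from cURL.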
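For reference, the same idea (blocks of a fixed size, wait for each block, record cURL errors) can be sketched more compactly as a function. This is only a sketch under assumptions, not the code I actually ran: the name fetch_in_blocks, its parameters, and the shape of the returned array are made up for illustration. It uses array_chunk to form the blocks, CURLOPT_PRIVATE to remember which URL a handle belongs to, and curl_multi_info_read to get each transfer's result code.

```php
<?php
// Hypothetical helper (not part of any library): requests $urls in blocks of
// $blockSize and returns array(url => array('http_code', 'errno', 'error')).
function fetch_in_blocks(array $urls, $blockSize = 5, $timeoutSeconds = 2)
{
    if ($blockSize < 1) {
        throw new InvalidArgumentException('Block size must be at least 1.');
    }

    $results = array();
    $multi = curl_multi_init();

    foreach (array_chunk($urls, $blockSize) as $block) {
        // Create one easy handle per URL and attach it to the multi handle.
        $handles = array();
        foreach ($block as $url) {
            $ch = curl_init($url);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // do not echo response bodies
            curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds); // per-request time limit
            curl_setopt($ch, CURLOPT_PRIVATE, $url);            // tag the handle with its URL
            curl_multi_add_handle($multi, $ch);
            $handles[] = $ch;
        }

        // Drive every transfer in this block until none are still running.
        do {
            $status = curl_multi_exec($multi, $running);
            if ($status > CURLM_OK) {
                break; // multi-level error: stop driving this block
            }
            // Sleep until a socket is ready; fall back to a short pause if select fails.
            if ($running > 0 && curl_multi_select($multi, 1.0) === -1) {
                usleep(100000);
            }
        } while ($running > 0);

        // Read the outcome of each finished transfer.
        while ($info = curl_multi_info_read($multi)) {
            $ch = $info['handle'];
            $url = curl_getinfo($ch, CURLINFO_PRIVATE);
            $results[$url] = array(
                'http_code' => curl_getinfo($ch, CURLINFO_HTTP_CODE),
                'errno'     => $info['result'],
                'error'     => $info['result'] === CURLE_OK ? '' : curl_strerror($info['result']),
            );
        }

        // Detach this block's handles before starting the next block.
        foreach ($handles as $ch) {
            curl_multi_remove_handle($multi, $ch);
        }
    }

    curl_multi_close($multi);
    return $results;
}
```

Calling fetch_in_blocks($urls, 5) with the URLs from the database would then run them five at a time. Any entry whose 'errno' is non-zero failed at the cURL level, for example by hitting the timeout.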

Answered 2010-04-22T16:41:40.567