0

Edit: the optimization results are at the end of this question!

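Hi, I have the following code that first scans the files in a given folder, then reads each file line by line and, after a myriad of "if ... else if" checks, writes the modified file into another folder under the same name it was opened with.

The problem is that writing the files line by line seems to be very slow. The default 60-second limit is only enough for about 25 files. File sizes range from 10k to 350k.

Is there any way to optimize the code so that it runs faster? Would it be better to read line by line, put each line into an array, and then write the whole array to a new text file (instead of reading/writing line by line)? If so, how is that done in practice?

Thanks in advance ----- code below -----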

<?php

function scandir_recursive($path)    {
...
...
}



$fileselection = scandir_recursive('HH_new');
foreach ($fileselection as $extractedArray) {
$tableName = basename($extractedArray); // Table name
$fileLines=file($extractedArray);
    foreach ($fileLines as $line) {
            if(preg_match('/\(all-in\)/i' , $line)) {
                $line = stristr($line, ' (all-in)', true) .', and is all in';
                $allin = ', and is all in';
            }
            else {
                $allin = '';
            }
            if(preg_match('/posts the small blind of \$[\d\.]+/i' , $line)) {
                $player = stristr($line, ' posts ', true);
                $betValue = substr(stristr($line, '$'), 1);
                $bettingMatrix[$player]['betTotal'] = $betValue;
            }
            else if(preg_match('/posts the big blind of \$[\d\.]+/i' , $line)) {
                $player = stristr($line, ' posts ', true);
                $betValue = substr(stristr($line, '$'), 1);
                $bettingMatrix[$player]['betTotal'] = $betValue;
            }
            else if(preg_match('/\S+ raises /i' , $line)) {
                $player = stristr($line, ' raises ', true);
                $betValue = substr(strstr($line, '$'), 1);
                $bettingMatrix[$player]['betTotal'] = $betValue; //total bet this hand (shortcut)
            }
            else if(preg_match('/\S+ bets /i' , $line)) {
                $player = stristr($line, ' bets ', true);
                $betValue = substr(strstr($line, '$'), 1);
                $bettingMatrix[$player]['betTotal'] = $betValue; //total bet this hand (shortcut)
            }
            else if(preg_match('/\S+ calls /i' , $line)) {
                $player = stristr($line, ' calls ', true);
                $betValue = substr(stristr($line, '$'), 1);
                $callValue = $betValue - $bettingMatrix[$player]['betTotal']; //actual amount called
                $bettingMatrix[$player]['betTotal'] = $betValue;
                $line = stristr($line, '$', true)."\$".$callValue.$allin;
                $allin = '';
            }
            else if(preg_match('/(\*\*\* (Flop|Turn|River))|(Full Tilt Poker)/i' , $line)) {
                unset($bettingMatrix); //zero $betValue
            }
            else if(preg_match('/\*\*\* FLOP \*\*\*/i' , $line)) {
                $flop = substr(stristr($line, '['), 0, -2);
                $line = '*** FLOP *** '. $flop;
            }
            else if(preg_match('/\*\*\* TURN \*\*\*/i' , $line)) {
                $turn = substr(stristr($line, '['), 0, -2);
                $line = '*** TURN *** '. $flop .' '. $turn;
            }
            else if(preg_match('/\*\*\* RIVER \*\*\*/i' , $line)) {
                $river = substr(stristr($line, '['), 0, -2);
                $line = '*** RIVER *** '. substr($flop, 0, -1) .' '. substr($turn, 1) .' '. $river;
            }
            else {
            }
        $ourFileHandle = fopen("HH_newest/".$tableName.".txt", 'a') or die("can't open file");
        fwrite($ourFileHandle, $line);
        fclose($ourFileHandle);
    }
}
?>

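Edit: After rewriting the code based on the tips everyone here gave me, here are the very interesting results.

60 text files, 5.8MB in total

With all optimizations (preg changed to strpos/strstr, and $handle opened before the loop): 4 seconds.

As above, but with strpos/strstr changed to stripos/stristr: 8 seconds.

As above, but with stripos/stristr changed to preg: 12 seconds.

As above, but with fopen moved back inside the loop: 45 of 60 files done when the 180-second run limit was hit.

Here is the full script: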

$fileselection = scandir_recursive('HH_new');
foreach ($fileselection as $extractedArray) {
    $tableName = basename($extractedArray); // Table name
    $handle         = fopen($extractedArray, 'r');
    $ourFileHandle  = fopen("HH_newest/".$tableName.".txt", 'a') or die("can't open file");
    while (false !== ($line = fgets($handle))) { // explicit check: a last line of just "0" is falsy
            if (FALSE !== strpos($line, '(all-in)')) {
                $line = strstr($line, ' (all-in)', true) .", and is all in\r\n";
                $allin = ', and is all in';
            } else {
                $allin = '';
            }
            if (FALSE !== strpos($line, ' posts the small blind of $')) {
                $player = strstr($line, ' posts ', true);
                $betValue = substr(strstr($line, '$'), 1);
                $bettingMatrix[$player]['betTotal'] = $betValue;
            }
            else if (FALSE !== strpos($line, ' posts the big blind of $')) {
                $player = strstr($line, ' posts ', true);
                $betValue = substr(strstr($line, '$'), 1);
                $bettingMatrix[$player]['betTotal'] = $betValue;
            }
            else if (FALSE !== strpos($line, ' posts $')) {
                $player = strstr($line, ' posts ', true);
                $betValue = substr(strstr($line, '$'), 1);
                $bettingMatrix[$player]['betTotal'] += $betValue;
            }
            else if (FALSE !== strpos($line, ' raises to $')) {
                $player = strstr($line, ' raises ', true);
                $betValue = substr(strstr($line, '$'), 1);
                $betMade = $betValue - $bettingMatrix[$player]['betTotal']; //actual amount raised by
                $bettingMatrix[$player]['betTotal'] = $betValue; //$line contains total bet this hand (shortcut)
            }
            else if (FALSE !== strpos($line, ' bets $')) {
                $player = strstr($line, ' bets ', true);
                $betValue = substr(strstr($line, '$'), 1);
                $betMade = $betValue - $bettingMatrix[$player]['betTotal']; //actual amount raised by
                $bettingMatrix[$player]['betTotal'] = $betValue; //$line contains total bet this hand (shortcut)
            }
            else if (FALSE !== strpos($line, ' calls $')) {
                $player = strstr($line, ' calls ', true);
                $betValue = substr(strstr($line, '$'), 1);
                $callValue = $betValue - $bettingMatrix[$player]['betTotal']; //actual amount called
                $bettingMatrix[$player]['betTotal'] = $betValue;
                $line = strstr($line, '$', true)."\$".$callValue.$allin. "\r\n";
                $allin = '';
            }
            else if (FALSE !== strpos($line, '*** FLOP ***')) {
                $flop = substr(strstr($line, '['), 0, -2);
                unset($bettingMatrix); //zero $betValue
            }
            else if (FALSE !== strpos($line, '*** TURN ***')) {
                $turn = substr(strstr($line, '['), 0, -2);
                $line = '*** TURN *** '.$flop.' '.$turn."\r\n";
                unset($bettingMatrix); //zero $betValue
            }
            else if (FALSE !== strpos($line, '*** RIVER ***')) {
                $river = substr(strstr($line, '['), 0, -2);
                $line = '*** RIVER *** '. substr($flop, 0, -1) .' '. substr($turn, 1) .' '. $river."\r\n";
                unset($bettingMatrix); //zero $betValue
            }
            else if (FALSE !== strpos($line, 'Full Tilt Poker')) {
                unset($bettingMatrix); //zero $betValue
            }
            else {
            }
        fwrite($ourFileHandle, $line);
    }
    fclose($handle);
    fclose($ourFileHandle);
}
4 Answers

5

I think it's because you open and close the file inside the loop. Try moving fopen() before the foreach and fclose() after it.
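As a minimal sketch of that suggestion (not the asker's actual script): the output file is opened once, the loop only writes, and the handle is closed once at the end. The helper name `writeLines` and the temporary file are assumptions made for illustration.

```php
<?php
// Hypothetical helper for illustration: appends every line to $path,
// opening the output file once instead of once per line.
function writeLines(array $lines, string $path): int
{
    $out = fopen($path, 'a');            // open once, before the loop
    if ($out === false) {
        throw new RuntimeException("can't open file: $path");
    }
    $bytes = 0;
    foreach ($lines as $line) {
        $bytes += fwrite($out, $line);   // only the write stays inside the loop
    }
    fclose($out);                        // close once, after the loop
    return $bytes;
}

// Usage with a throwaway file (tempnam() creates an empty file).
$target  = tempnam(sys_get_temp_dir(), 'hh_');
$written = writeLines(['Seat 1 bets $2' . "\n", 'Seat 2 calls $2' . "\n"], $target);
```

Each fopen()/fclose() pair is a round trip to the file system, so doing it per line multiplies that cost by the number of lines in every file.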

answered 2009-11-13T16:16:25.657
4

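I doubt file writing is the performance problem here. You're running ten regular expressions against every line!

Using string functions such as strpos to find substrings would probably speed things up.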
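A small sketch of what that swap could look like for one of the checks in the question. The sample line is made up for illustration, and the two tests are not strictly equivalent: the regex also requires a non-space character before " calls ", while strpos() only looks for the literal substring.

```php
<?php
// Made-up sample line, in the style of the hand histories in the question.
$line = 'Hero calls $4.50';

// Regex version, similar to the check in the question.
$viaRegex = preg_match('/\S+ calls /', $line) === 1;

// Plain substring search. Compare with !== false, because a match at
// offset 0 returns int(0), which is falsy.
$viaStrpos = strpos($line, ' calls ') !== false;

// strstr() with true as the third argument returns the part before the needle.
$player = strstr($line, ' calls ', true);
// Everything after the first '$' is the amount.
$amount = substr(strstr($line, '$'), 1);
```

strpos() and strstr() are case-sensitive; stripos() and stristr() are the case-insensitive counterparts if the input casing varies.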

answered 2009-11-13T16:10:24.553
2

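Getting rid of the regular expressions should give you the biggest performance boost. If you can change them to strpos() or similar (stripos() is the case-insensitive version), you should notice a speed improvement.

The test needs to be '!== false' because the string you are looking for may be found at position 0. For example, your first test case could be: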

if(stripos($line, '(all-in)') !== false) {
    //generate output
}

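You may also find that using fgets() rather than reading the whole file at once gives you some performance improvement (though that is more of a memory issue). And, as others have mentioned, only write to the file inside the loop; don't open and close it there.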
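A rough sketch of the fgets() approach, assuming PHP 7.4+ for the arrow function. The helper name `transformFile`, the temporary files and the sample content are hypothetical and only there to make the example self-contained.

```php
<?php
// Hypothetical helper: copies $src to $dst one line at a time, passing each
// line through $transform, so only one line is held in memory at once.
function transformFile(string $src, string $dst, callable $transform): int
{
    $in  = fopen($src, 'r');
    $out = fopen($dst, 'w');
    if ($in === false || $out === false) {
        throw new RuntimeException('could not open input or output file');
    }
    $count = 0;
    // Compare against false explicitly: a final line containing only "0"
    // (no trailing newline) is falsy and would otherwise end the loop early.
    while (($line = fgets($in)) !== false) {
        fwrite($out, $transform($line));
        $count++;
    }
    fclose($in);
    fclose($out);
    return $count;
}

// Usage with throwaway files and made-up content.
$src = tempnam(sys_get_temp_dir(), 'in_');
$dst = tempnam(sys_get_temp_dir(), 'out_');
file_put_contents($src, "a (all-in)\nb\n0");
$lines = transformFile(
    $src,
    $dst,
    fn(string $l): string => str_replace(' (all-in)', ', and is all in', $l)
);
```

Both handles are opened once per file, so the loop body does nothing but read, transform and write.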

answered 2009-11-13T16:15:06.983
1

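Here's your code with a few small changes that should help

  1. Switched from file() to fgets(). This loads only one line into memory at a time instead of every line in the file.
  2. Changed your calls to preg_match() to stripos() where applicable. Should be a bit faster.
  3. Moved the opening/closing of $ourFileHandle to the outer loop. This significantly reduces the number of stat calls to the file system and should speed things up a lot.

There are probably lots of other optimizations that could be made in that ghastly if..else, but I'll leave those to another SOer (or you)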

$fileselection = scandir_recursive('HH_new');
foreach ($fileselection as $extractedArray)
{ 
  $tableName     = basename( $extractedArray ); // Table name
  $handle        = fopen( $extractedArray, 'r' );
  $ourFileHandle = fopen("HH_newest/".$tableName.".txt", 'a') or die("can't open file");

  while ( false !== ( $line = fgets( $handle ) ) ) // explicit check: a last line of just "0" is falsy
  {
    if ( false !== stripos( $line, '(all-in)' ) )
    {
      $line = stristr($line, ' (all-in)', true) .', and is all in';
      $allin = ', and is all in';
    } else {
      $allin = '';
    }
    if ( preg_match('/posts the small blind of \$[\d\.]+/i' , $line ) )
    {
            $player = stristr($line, ' posts ', true);
            $betValue = substr(stristr($line, '$'), 1);
            $bettingMatrix[$player]['betTotal'] = $betValue;
    }
    else if(preg_match('/posts the big blind of \$[\d\.]+/i' , $line)) {
            $player = stristr($line, ' posts ', true);
            $betValue = substr(stristr($line, '$'), 1);
            $bettingMatrix[$player]['betTotal'] = $betValue;
    }
    else if(preg_match('/\S+ raises /i' , $line)) {
            $player = stristr($line, ' raises ', true);
            $betValue = substr(strstr($line, '$'), 1);
            $bettingMatrix[$player]['betTotal'] = $betValue; //total bet this hand (shortcut)
    }
    else if(preg_match('/\S+ bets /i' , $line)) {
            $player = stristr($line, ' bets ', true);
            $betValue = substr(strstr($line, '$'), 1);
            $bettingMatrix[$player]['betTotal'] = $betValue; //total bet this hand (shortcut)
    }
    else if(preg_match('/\S+ calls /i' , $line)) {
            $player = stristr($line, ' calls ', true);
            $betValue = substr(stristr($line, '$'), 1);
            $callValue = $betValue - $bettingMatrix[$player]['betTotal']; //actual amount called
            $bettingMatrix[$player]['betTotal'] = $betValue;
            $line = stristr($line, '$', true)."\$".$callValue.$allin;
            $allin = '';
    }
    else if(preg_match('/(\*\*\* (Flop|Turn|River))|(Full Tilt Poker)/i' , $line)) {
            unset($bettingMatrix); //zero $betValue
    }
    else if ( FALSE !== stripos( $line, '*** FLOP ***' ) )
    {
            $flop = substr(stristr($line, '['), 0, -2);
            $line = '*** FLOP *** '. $flop;
    }
    else if ( FALSE !== stripos( $line, '*** TURN ***' ) )
    {
            $turn = substr(stristr($line, '['), 0, -2);
            $line = '*** TURN *** '. $flop .' '. $turn;
    }
    else if ( FALSE !== stripos( $line, '*** RIVER ***' ) )
    {
            $river = substr(stristr($line, '['), 0, -2);
            $line = '*** RIVER *** '. substr($flop, 0, -1) .' '. substr($turn, 1) .' '. $river;
    }
    else {
    }
    fwrite($ourFileHandle, $line);
  }
  fclose( $handle );
  fclose( $ourFileHandle );
}
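One possible direction for tidying the remaining if..else chain, offered only as a sketch: the posts/raises/bets/calls branches all extract a player and an amount the same way, so that part could be factored into a helper. The function name `parseAction` and the verb list are assumptions for illustration; the per-action bookkeeping (for example rewriting the "calls" line) would still need its own handling.

```php
<?php
// Hypothetical helper: returns [player, amount] for a betting line,
// or null when the line contains none of the given verbs.
function parseAction(string $line, array $verbs): ?array
{
    foreach ($verbs as $verb) {
        $needle = " $verb ";
        if (strpos($line, $needle) !== false && strpos($line, '$') !== false) {
            $player = strstr($line, $needle, true);              // text before the verb
            $amount = (float) substr(strstr($line, '$'), 1);     // number after the first '$'
            return [$player, $amount];
        }
    }
    return null;
}

$verbs  = ['posts', 'raises', 'bets', 'calls'];
$parsed = parseAction('Hero raises to $10' . "\r\n", $verbs);
$none   = parseAction('*** FLOP *** [Ah Kd 2c]', $verbs);
```

This is mainly a readability change; whether it is any faster than the explicit chain would have to be measured.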
answered 2009-11-13T16:36:14.290