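Feel free to use this script that I put together to extract large (sometimes gigabyte-sized) CSV files from the data warehouse at my work. It uses next to no system resources and is lightweight enough to be nice and quick. I run it from the command-line interface, so the output is written for a nice, simple display.
It checks the connection to make sure it can connect, then makes sure the query can run (that is, it has no syntax errors), and extracts the data into a CSV file with the query's column names as the first line of the file. While it runs, it tells you which row it is up to (to the nearest thousand), so if you know the expected size of the result set you can estimate how much is left.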
<?php
// I have this set up for a few databases, so I keep all my usernames, passwords and extra options.
// DWH
$dwhusername = 'user';
$dwhpassword = 'pass';
$dwhhostname = 'oci:dbname=databaseName';
$options = array(PDO::ATTR_AUTOCOMMIT=>FALSE);
// EDW
$edwusername = 'user';
$edwpassword = 'pass';
$edwhostname = 'oci:dbname=databaseName';
// MySQL
$musername = 'user';
$mpassword = 'pass';
$mhostname = 'mysql:host=localhost;dbname=databaseName';
// Feel free to set this to anything you like. It is just used to give you a visual start and end time.
date_default_timezone_set('Australia/Sydney');
// Enter the filename that you want to save to below.
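// Single quotes keep the backslashes literal (in double quotes, \n or \t in a path would be read as escapes).
$File = 'C:\Server\yourFileName.csv';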
// Enter your SQL query that you use here.
$sql="
select
col1 as Column1,
col2 as Sales,
col3 as TradeDate
from
myTable
where
col1=someCondition
";
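// Formats the time between two Unix timestamps as a string such as "3m 43s".
function time_diff_conv($start, $s)
{
    $string = "";
    $t = array( // unit suffix => length of the unit in seconds
        'd' => 86400,
        'h' => 3600,
        'm' => 60,
    );
    $s = abs($s - $start);
    foreach ($t as $key => $val) {
        $count = floor($s / $val);
        $s -= $count * $val;
        // Units that come out as zero are left out of the string.
        $string .= ($count == 0) ? '' : $count . "$key ";
    }
    return $string . $s . 's';
}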
echo "\n\nStarting extract job on ".date('l jS \of F Y \a\t h:i:s A')."\n";
$Handle = fopen($File, 'w');
if ($Handle === false) {
    exit("Could not open $File for writing.\n");
}
try {
    $dbh = new PDO($dwhhostname, $dwhusername, $dwhpassword, $options);
    //$dbh = new PDO($edwhostname, $edwusername, $edwpassword, $options);
    //$dbh = new PDO($mhostname, $musername, $mpassword);
    // Make PDO throw on query errors so the catch block below sees them.
    $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    echo "Connection to database appears fine. Running query.\n";
    $timeQuery = time();
    $stmt = $dbh->prepare($sql, array(PDO::ATTR_CURSOR => PDO::CURSOR_FWDONLY));
    $dbh->beginTransaction();
    $stmt->execute();
    $i = 0; // number of data rows written so far
    $Data = "";
    $data2 = "";
    // Fetch the first row up front so its keys can be written as the header line.
    $obj = $stmt->fetch(PDO::FETCH_ASSOC);
    if ($obj !== false) {
        foreach ($obj as $key => $value) {
            $Data .= $key . ",";
            $data2 .= $value . ",";
        }
        $i++;
        $Data = substr($Data, 0, -1);
        $Data .= "\n" . substr($data2, 0, -1) . "\n";
        $data2 = null;
        fwrite($Handle, $Data);
    }
    $timeResult = time();
    echo "Query Result Returned on ".date('l jS \of F Y \a\t h:i:s A')."\nExporting data now...\n\n";
    $Data = "";
    while ($obj = $stmt->fetch(PDO::FETCH_ASSOC)) {
        foreach ($obj as $key => $value) {
            $Data .= $value . ",";
        }
        $Data = substr($Data, 0, -1);
        $Data .= "\n";
        $i++;
        // Flush the buffer to disk every 1000 rows and report progress.
        if ($i % 1000 == 0) {
            fwrite($Handle, $Data);
            $Data = "";
            echo "\rWritten $i rows of data to the file.\r";
        }
    }
    // Write whatever is still left in the buffer.
    fwrite($Handle, $Data);
    $Data = "";
    $stmt = null;
    $dbh->commit();
    $dbh = null;
}
catch (PDOException $e) {
    echo 'Error : ' . $e->getMessage() . "\n";
    fclose($Handle);
    exit(1);
}
$timeComplete=time();
$timeQueryRes=time_diff_conv($timeQuery, $timeResult);
$timeResRes=time_diff_conv($timeResult, $timeComplete);
$timeAverage=number_format(round($i/($timeComplete-$timeResult+1),0));
echo "Query took ".$timeQueryRes." to return a result\n";
echo "The resultset of $i rows took ".$timeResRes." to completely extract at an average of ".$timeAverage." rows per second.\n";
fclose($Handle);
echo "Data written to: ".$File."\n";
echo "Finished extract job on ".date('l jS \of F Y \a\t h:i:s A')."\n\n\n";
?>
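The time_diff_conv() helper in the script is what produces durations such as "3m 43s". As a rough sketch of the arithmetic, here is a stand-alone variant that takes a number of seconds directly. The name format_duration and its single-argument form are my own assumptions for illustration, not part of the script above.

```php
<?php
// Hypothetical stand-alone variant of time_diff_conv(); the name and the
// single-argument signature are assumptions made for this example only.
function format_duration($seconds)
{
    // Unit suffix => length of the unit in seconds, largest first.
    $units = array('d' => 86400, 'h' => 3600, 'm' => 60);
    $parts = array();
    foreach ($units as $suffix => $size) {
        $count = (int) floor($seconds / $size);
        $seconds -= $count * $size;
        if ($count > 0) {
            $parts[] = $count . $suffix; // units that are zero are skipped
        }
    }
    $parts[] = $seconds . 's'; // whatever remains is seconds
    return implode(' ', $parts);
}

echo format_duration(223), "\n";   // 3m 43s
echo format_duration(90061), "\n"; // 1d 1h 1m 1s
echo format_duration(5), "\n";     // 5s
```

It peels off days, hours and minutes in turn, so 223 seconds becomes 3 whole minutes with 43 seconds left over.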
The output of the extract script in the terminal/console looks like this:
U:\>c:\server\wamp\bin\php\php5.3.0\php.exe -f "C:\server\wamp\www\store\inc\outputData2.php"
Starting extract job on Thursday 19th of July 2012 at 04:25:44 PM
Connection to database appears fine. Running query.
Query Result Returned on Thursday 19th of July 2012 at 04:29:28 PM
Exporting data now...
Query took 3m 43s to return a resultfile.
The resultset of 1341447 rows took 4m 37s to completely extract at an average of 4,825 rows per second.
Data written to: C:\Server\DailyStoreSales-2012-06-27.csv
Finished extract job on Thursday 19th of July 2012 at 04:34:05 PM
U:\>
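One thing to watch: the script joins values with plain commas, so a value that itself contains a comma, a double quote or a line break will not come out as valid CSV. If your data can contain those, PHP's built-in fputcsv() takes care of the quoting. Below is a minimal sketch, assuming the rows are associative arrays like the ones PDO::FETCH_ASSOC returns. The helper name write_csv_rows and the sample rows are made up for illustration, and the escape argument assumes a reasonably recent PHP version.

```php
<?php
// Hypothetical helper (not part of the original script): writes associative
// rows to an open file handle as CSV, with the column names as the first line.
function write_csv_rows($handle, $rows)
{
    $count = 0;
    foreach ($rows as $row) {
        if ($count === 0) {
            // The header comes from the keys of the first row.
            fputcsv($handle, array_keys($row), ',', '"', '\\');
        }
        // fputcsv() quotes values that contain the delimiter, quotes or newlines.
        fputcsv($handle, array_values($row), ',', '"', '\\');
        $count++;
    }
    return $count;
}

// Stand-in data; in the extract script the rows would come from $stmt->fetch().
$rows = array(
    array('Column1' => 'Sydney, NSW', 'Sales' => 10, 'TradeDate' => '2012-06-27'),
    array('Column1' => 'The "Big" Store', 'Sales' => 20, 'TradeDate' => '2012-06-27'),
);

$out = fopen('php://memory', 'w+'); // in-memory stream, so the example leaves no file behind
$written = write_csv_rows($out, $rows);
rewind($out);
$csv = stream_get_contents($out);
fclose($out);
echo $csv;
```

The trade-off is one fputcsv() call per row instead of one fwrite() per thousand rows. PHP's file streams are buffered, so I would not expect a large difference, but I have not measured it against the warehouse extracts.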