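I have a file that is about 10 GB or larger. Each line of the file contains only a number from 1 to 10 and nothing else. The task is to read the numbers from the file, sort them in ascending or descending order, and create a new file with the sorted numbers.
Can anyone help me with this?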
I'm assuming this is some kind of homework and the goal is to sort more data than you can hold in RAM?
Since you only have the numbers 1-10, this is not that complicated a task. Just open your input file and count how many occurrences of each number you have. After that you can write a simple loop that writes the values into another file. The following example is pretty self-explanatory.
$inFile = '/path/to/input/file';
$outFile = '/path/to/output/file';
$input = fopen($inFile, 'r');
if ($input === false) {
    throw new Exception('Unable to open: ' . $inFile);
}
// $map is an array with keys 1..10, each initialised to 0
$map = array_fill(1, 10, 0);
// Read the file line by line and count how often each number occurs
while (($line = fgets($input)) !== false) {
    $int = (int) $line;
    // Skip blank or out-of-range lines instead of creating stray keys
    if (isset($map[$int])) {
        $map[$int]++;
    }
}
fclose($input);
$output = fopen($outFile, 'w');
if ($output === false) {
    throw new Exception('Unable to open: ' . $outFile);
}
/*
 * Reverse the array if you need descending instead of ascending order.
 * Pass true as the second argument so the keys 1..10 are preserved.
 */
//$map = array_reverse($map, true);
// Write the values into the output file
foreach ($map as $number => $count) {
    $string = ((string) $number) . PHP_EOL;
    for ($i = 0; $i < $count; $i++) {
        fwrite($output, $string);
    }
}
fclose($output);
Since you are dealing with huge files, you should also check the script execution time limit (max_execution_time) of your PHP environment. The example above will take a VERY long time for files of 10 GB or more, but since I didn't see any constraints on execution time or performance in your question, I'm assuming that is OK.
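If the running time does matter, one option is to lift the time limit and read the file in large chunks instead of line by line. What follows is only a rough sketch under my own assumptions: the function names countNumbers, tallyLines, writeSorted and sortByCounting, the 1 MiB chunk size and the write batch size are all made up for illustration, and I haven't benchmarked it against the fgets loop.

```php
<?php
// Sketch only: function names, chunk size and batch size are illustrative assumptions.

// Count how often each of the values 1..10 occurs, reading the file in chunks.
function countNumbers(string $inPath, int $chunkSize = 1048576): array
{
    $in = fopen($inPath, 'rb');
    if ($in === false) {
        throw new RuntimeException('Unable to open: ' . $inPath);
    }
    $counts = array_fill(1, 10, 0);
    $carry = ''; // incomplete last line left over from the previous chunk
    while (!feof($in)) {
        $chunk = fread($in, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }
        $data = $carry . $chunk;
        $lastNewline = strrpos($data, "\n");
        if ($lastNewline === false) {
            $carry = $data; // no complete line yet, keep reading
            continue;
        }
        $carry = (string) substr($data, $lastNewline + 1);
        tallyLines((string) substr($data, 0, $lastNewline), $counts);
    }
    fclose($in);
    tallyLines($carry, $counts); // last line without a trailing newline
    return $counts;
}

// Add every valid number found in $block (newline separated) to $counts.
function tallyLines(string $block, array &$counts): void
{
    foreach (explode("\n", $block) as $line) {
        $n = (int) trim($line);
        if (isset($counts[$n])) { // ignores blank lines and out-of-range values
            $counts[$n]++;
        }
    }
}

// Write each number as many times as it was counted, in batches of lines.
function writeSorted(array $counts, string $outPath, bool $descending = false, int $batch = 100000): void
{
    $out = fopen($outPath, 'wb');
    if ($out === false) {
        throw new RuntimeException('Unable to open: ' . $outPath);
    }
    if ($descending) {
        krsort($counts); // sorts by key and keeps the keys intact
    } else {
        ksort($counts);
    }
    foreach ($counts as $number => $count) {
        $line = $number . "\n";
        while ($count > 0) {
            $n = min($count, $batch);
            fwrite($out, str_repeat($line, $n));
            $count -= $n;
        }
    }
    fclose($out);
}

function sortByCounting(string $inPath, string $outPath, bool $descending = false): void
{
    set_time_limit(0); // 0 removes the execution time limit for this script
    writeSorted(countNumbers($inPath), $outPath, $descending);
}
```

You would call it as sortByCounting('/path/to/input/file', '/path/to/output/file'). Memory use should depend on the chunk and batch sizes rather than on the size of the file.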
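I had a similar problem before. Trying to manipulate files this large ends up consuming a lot of resources, and the script can't cope. The simplest solution I ended up with was to import the file into a MySQL database using a fast data-loading statement called LOAD DATA INFILE: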
http://dev.mysql.com/doc/refman/5.1/en/load-data.html
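Once it's in, you should be able to manipulate the data.
Alternatively, you could just read the file line by line while writing the sorted numbers line by line to another file. I'm not quite sure how that would work, though.
Have you tried anything yet, or are you just looking for a possible approach?
If that is all you need, you don't need PHP (if you have a Linux machine at hand):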
sort -n file > file_sorted-asc
sort -nr file > file_sorted-desc
Edit: OK, here is your solution in PHP (if you have a Linux machine at hand):
<?php
// Sort ascending
`sort -n file > file_sorted-asc`;
// Sort descending
`sort -nr file > file_sorted-desc`;
?>
:)
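If you go the sort route from PHP, it might be worth escaping the file names and checking the exit code, since the backtick operator gives you no exit code. A minimal sketch, assuming a sort command that supports -n, -r and -o is on the PATH; the helper name sortWithSystemSort is my own invention:

```php
<?php
// Sketch only: sortWithSystemSort is a hypothetical helper, not an existing PHP function.
// Sorts $inPath numerically into $outPath using the system `sort` command.
function sortWithSystemSort(string $inPath, string $outPath, bool $descending = false): void
{
    $cmd = 'sort ' . ($descending ? '-nr' : '-n')
         . ' -o ' . escapeshellarg($outPath) // -o writes the result to a file
         . ' ' . escapeshellarg($inPath)
         . ' 2>&1';                          // capture error messages as well

    $outputLines = [];
    $exitCode = 0;
    exec($cmd, $outputLines, $exitCode);

    if ($exitCode !== 0) {
        throw new RuntimeException(
            'sort failed with exit code ' . $exitCode . ': ' . implode("\n", $outputLines)
        );
    }
}
```

For example, sortWithSystemSort('file', 'file_sorted-desc', true) should do the same as the second backtick command above, but throws if sort reports an error.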