text - Using Perl6 to process a large text file, and it's too slow. (2014-09)

Question

The code is hosted at https://github.com/yeahnoob/perl6-perf and is as follows:

use v6;

my $file=open "wordpairs.txt", :r;

my %dict;
my $line;

repeat {
    $line=$file.get;
    my ($p1,$p2)=$line.split(' ');
    if ?%dict{$p1} {
        %dict{$p1} = "{%dict{$p1}} {$p2}".words;
    } else {
        %dict{$p1} = $p2;
    }
} while !$file.eof;

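It runs well when "wordpairs.txt" is small.

But when "wordpairs.txt" has about 140,000 lines (two words per line), it runs very, very slowly. It does not finish even after 20 seconds.

What is the problem? Is there a fault in the code? Thanks to anyone who can help!

The following was added on 2014-09-04, thanks to the many suggestions from the SE answers and from #perl6 on IRC (freenode).

The code (as of 2014-09-04):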

my %dict;
grammar WordPairs {
    token word-pair { (\S*) ' ' (\S*) "\n" }
    token TOP { <word-pair>* }
}
class WordPairsActions {
    method word-pair($/) { %dict{$0}.push($1) }
}
my $match = WordPairs.parse(slurp, :actions(WordPairsActions));
say ?$match;

Running time (as of now):

$ time perl6 countpairs.pl wordpairs.txt
True
The pairs count of the key word "her" in wordpairs.txt is 1036

real    0m24.043s
user    0m23.854s
sys     0m0.181s

$ perl6 --version
This is perl6 version 2014.08 built on MoarVM version 2014.08

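The time performance in this test is still not reasonable (the equivalent, correct Perl 5 code takes only about 160 ms), but it is much better than my original Perl6 code. :)

PS. Everything, including the original test code, the patch, and the sample text, is on GitHub.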

score 3 · Accepted Answer

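I tested this with code very similar to Christoph's, using a file of 10,000 lines. It takes around 15 seconds, which, as you say, is significantly slower than Perl 5. I suspect the code is slow because something it uses has not yet received as much optimisation work as other parts of Rakudo and MoarVM have recently. I am sure the performance will improve dramatically over the next few months as whatever is slow gets more attention.

When trying to work out why some Perl 6 code is slow, I suggest running perl6 with --profile on MoarVM to see whether it helps you find the bottleneck. Unfortunately, with this code it will point at Rakudo internals rather than at anything you can improve yourself.

It is certainly worth talking to #perl6 on irc.freenode.net: the people there have the knowledge to offer an alternative solution and will be able to improve the performance in the future.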

score 2 · Accepted Answer

Rakudo isn't known for its great performance.

Using more idiomatic code might or might not help:

my %dict;
for open('wordpairs.txt', :r).lines {
    my ($key, @words) = .words;
    push %dict{$key}, @words;
}
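To see what the idiomatic loop above buys, note that the original code rebuilds a string from all values collected so far and splits it again with .words on every repeated key, while push only appends. Whether that is the dominant cost on a given Rakudo build is an assumption that would need measuring. A minimal sketch on a hypothetical in-memory sample (the three sample lines are made up, and each line is assumed to hold exactly two words):

```perl6
use v6;

# Hypothetical in-memory sample standing in for wordpairs.txt.
my @lines = 'her cat', 'his dog', 'her hat';

my %dict;
for @lines {
    # Each line is assumed to be "key value".
    my ($key, $value) = .words;
    # push autovivifies an Array under $key and appends to it;
    # nothing already stored is re-stringified or re-split.
    %dict{$key}.push($value);
}

say %dict<her>.elems;   # 2
```

Pushing one scalar per line keeps the stored structure a flat list of words per key.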

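You could also check the other backends (Rakudo runs on MoarVM, Parrot and the JVM) to see whether it is equally slow everywhere.

It would be interesting to know whether it is the IO or the processing that is slow, e.g. via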

my %dict;

say 'start IO';
my @lines = eager open('wordpairs.txt', :r).lines;
say 'done IO';

say 'start processing';
for @lines { ... }
say 'done processing';
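The same split can be given numbers by reading the clock with now around each phase. This is only a sketch: the file name sample-pairs.txt and its three lines are hypothetical and are written by the script itself so that it runs on its own; point $path at the real file (and drop the spurt line) to measure actual data.

```perl6
use v6;

# Hypothetical sample file, created here so the sketch is self-contained.
my $path = 'sample-pairs.txt';
spurt $path, "her cat\nhis dog\nher hat\n";

my $t0 = now;
my @lines = $path.IO.lines;   # assigning to an array reads every line up front
my $t1 = now;

my %dict;
for @lines {
    my ($key, $value) = .words;
    %dict{$key}.push($value);
}
my $t2 = now;

say "IO:         { $t1 - $t0 } s";
say "processing: { $t2 - $t1 } s";
```

If the IO phase dominates, changing the hash handling is unlikely to help much; if processing dominates, the loop body is the place to look.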

I believe there is also a profiler available if you want to dig into the issue yourself.
