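I am trying out this Porter stemmer. The header comments below say I should change the first line. I did change it, but the stemmer still does not work. What would a good example look like?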
#!/usr/local/bin/perl -w
#
# Perl implementation of the porter stemming algorithm
# described in the paper: "An algorithm for suffix stripping, M F Porter"
# http://www.muscat.com/~martin/stem.html
#
# Daniel van Balen (vdaniel@ldc.usb.ve)
#
# October-1999
#
# To Use:
#
# Put the line "use porter;" in your code. This will import the subroutine
# porter into your current name space (by default this is Main:: ). Make
# sure this file, "porter.pm" is in your @INC path (it includes the current
# directory).
# Afterwards use by calling "porter(<word>)" where <word> is the word to strip.
# The stripped word will be the returned value.
#
# REMEMBER TO CHANGE THE FIRST LINE TO POINT TO THE PATH TO YOUR PERL
# BINARY
#
This is the code I have written:
use Lingua::StopWords qw(getStopWords);
use Main::porter;
my $stopwords = getStopWords('en');
@stopwords = grep { $stopwords->{$_} } (keys %$stopwords);
chdir("c:/perl/input");
@files = <*>;
foreach $file (@files)
{
    open (input, $file);
    while (<input>)
    {
        open (output,">>c:/perl/normalized/".$file);
        chomp;
        porter<$_>;
        for my $stop (@stopwords)
        {
            s/\b\Q$stop\E\b//ig;
        }
        $_ =~s/<[^>]*>//g;
        $_ =~ s/[[:punct:]]//g;
        print output "$_\n";
    }
}
close (input);
close (output);
The code does not give any errors, it just does not stem anything!