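I want to read in a file with the format "string1 string2 string3" and replace several character sequences according to the following rules (but each character should be replaced only once): tsch => tch, ch => h, ki => ky (but only if ki is at the end of a "word"). So "tschaiki" should become "tchaiky" and not "thaiky" (which is what happens when using a for loop or several separate substitution commands).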
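To make the failure mode concrete, here is a minimal sketch contrasting the two approaches for the three rules above. The helper names `sequential` and `single_pass` are hypothetical, and `(?!\w)` (not followed by a word character) is only one assumed way of expressing "end of a word".

```perl
use strict;
use warnings;

# The three rules from the description above.
my %rule = (tsch => "tch", ch => "h", ki => "ky");

# Hypothetical helper: one substitution command per rule.
# Each command re-reads the output of the previous one.
sub sequential {
    my ($text) = @_;
    $text =~ s/tsch/tch/g;
    $text =~ s/ch/h/g;          # also hits the "ch" produced by the line above
    $text =~ s/ki(?!\w)/ky/g;   # "ki" only when no word character follows
    return $text;
}

# Hypothetical helper: a single substitution with one alternation.
# The scan moves left to right and never re-reads replaced text, so the
# "ch" inside "tsch" is consumed as part of "tsch" and is not touched again.
sub single_pass {
    my ($text) = @_;
    $text =~ s/(tsch|ch|ki(?!\w))/$rule{$1}/g;
    return $text;
}

print sequential("tschaiki"), "\n";
print single_pass("tschaiki"), "\n";
```

With the input `tschaiki`, `sequential` returns `thaiky`, while `single_pass` returns `tchaiky`.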
I know this question has been asked before and was solved by building a hash in Perl:
my $line = <>;
my %replace =(j=> "y", ss=> "s", u=> "ou", tsch=> "ch"); #short versions of the rules
my $regex = join "|", keys %replace;
$regex = qr/$regex/;
$line=~s/($regex)/$replace{$1}/g;
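So far this works for me too, but I want some sequences to be replaced only at the end of a string, and that is where the problem starts. I extended the code above with a second regex and hash that are used only for endings: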
my %replace_end =(ia=> "iya", ki=> "ky",ei=> "ey" );
my $regex_end = join "|", keys %replace_end;
$regex_end = qr/$regex_end/;
$line=~s/($regex_end)$/$replace_end{$1}/g; # saying just to substitute at the end
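One thing worth checking here, offered as a sketch rather than a confirmed diagnosis: `$` anchors at the end of the whole line (just before a trailing newline), not at the end of every space-separated word. The example below assumes that "end" means "end of each word" and uses the negative lookahead `(?!\w)` for that. The helper names `end_of_line_only` and `end_of_each_word` and the sample line are hypothetical.

```perl
use strict;
use warnings;

my %replace_end = (ia => "iya", ki => "ky", ei => "ey");

# Longest keys first, metacharacters escaped, so the alternation is predictable.
my $alt = join "|",
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) || $a cmp $b }
    keys %replace_end;

# Hypothetical helper: anchored with "$", so only the end of the line counts.
sub end_of_line_only {
    my ($line) = @_;
    $line =~ s/($alt)$/$replace_end{$1}/g;
    return $line;
}

# Hypothetical helper: anchored with "no word character follows",
# so the end of every word on the line counts.
sub end_of_each_word {
    my ($line) = @_;
    $line =~ s/($alt)(?!\w)/$replace_end{$1}/g;
    return $line;
}

my $sample = "tschaiki maria nikolei\n";
print end_of_line_only($sample);
print end_of_each_word($sample);
```

For the sample line, `end_of_line_only` changes only the last word (`nikolei` to `nikoley`), while `end_of_each_word` also turns `tschaiki` into `tschaiky` and `maria` into `mariya`.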
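My complete code is below, but both the exceptions and the endings are ignored (I believe the code did work without the file handling and the while loop):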
#!/usr/bin/perl
use strict;
use warnings;
open(INP, "<:utf8", "dt_namen.txt") or die "Cannot open dt_namen.txt: $!";
open(OUT, ">:utf8", "dt_zu_engl.txt") or die "Cannot open dt_zu_engl.txt: $!";
my %replace =(j=> "y", ss=> "s", tsch=> "ch", sch => "sh", c => "k", J=> "Y", Ss=>"s");
my $regex = join "|", keys %replace;
$regex = qr/$regex/;
my %replace_end =(ki=> "ky",ei=> "ey" );
my $regex_end = join "|", keys %replace_end;
$regex_end = qr/$regex_end/;
while(my $line= <INP>){
$line=~s/($regex)/$replace{$1}/g;
$line=~s/($regex_end)$/$replace_end{$1}/g; # saying just to substitute at the end
print $line;
print OUT "$line";
}
close INP;
close OUT;
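One possible way to get "each character replaced only once" together with the end-of-word rules is to fold both hashes into a single substitution. The following is a sketch under stated assumptions, not a definitive fix: the helpers `alternation` and `transliterate` are hypothetical names, "end of a word" is assumed to mean "not followed by a word character", and file handling is left out so the logic can be tried on plain strings.

```perl
use strict;
use warnings;

# Rule sets taken from the script above.
my %replace     = (j => "y", ss => "s", tsch => "ch", sch => "sh",
                   c => "k", J => "Y", Ss => "s");
my %replace_end = (ki => "ky", ei => "ey");

# Hypothetical helper: build one alternation from a list of keys,
# longest first, with regex metacharacters escaped.
sub alternation {
    my @keys = sort { length($b) <=> length($a) || $a cmp $b } @_;
    return join "|", map { quotemeta($_) } @keys;
}

my $any = alternation(keys %replace);
my $end = alternation(keys %replace_end);

# Group 1: end-of-word rules, valid only when no word character follows.
# Group 2: rules that apply anywhere.
my $regex = qr/($end)(?!\w)|($any)/;

# Hypothetical helper: apply all rules in one left-to-right pass, so text
# produced by a replacement is never matched again.
sub transliterate {
    my ($line) = @_;
    $line =~ s/$regex/defined $1 ? $replace_end{$1} : $replace{$2}/ge;
    return $line;
}

print transliterate("tschaiki Josef nikolei\n");
```

Inside the `while` loop of the script, the two substitution lines could then be replaced by a single call such as `$line = transliterate($line);`. Because the "ch" written for "tsch" is output rather than input, the rule `c => "k"` does not touch it.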