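Something like this should do it. Note that converting strings this way can introduce duplicates, because two different inputs may map to the same output.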
my $input = "Parks and Recreation - S05E01 - Ms. Knope Goes to Washington";
$input =~ s/ - /_/g; # Replace all " - " with "_"
$input =~ s/[^A-Za-z0-9]/_/g; # Replace all non-alphanumericals with "_"
print $input;
This outputs:
Parks_and_Recreation_S05E01_Ms__Knope_Goes_to_Washington
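To illustrate the duplicate caveat, here is a minimal sketch that wraps the two substitutions above in a helper and reports inputs that collapse to the same result. The name `sanitize`, the `%seen` hash, and the sample strings are assumptions made for this illustration, not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the same two substitutions shown above.
sub sanitize {
    my ($name) = @_;
    $name =~ s/ - /_/g;             # Replace all " - " with "_"
    $name =~ s/[^A-Za-z0-9]/_/g;    # Replace all non-alphanumericals with "_"
    return $name;
}

my %seen;    # sanitized name => first original input that produced it
for my $name ("Ms. Knope", "Ms, Knope") {
    my $clean = sanitize($name);
    if (exists $seen{$clean}) {
        # Two distinct inputs produced the same sanitized string.
        printf "Collision: %s and %s both become %s\n",
            $name, $seen{$clean}, $clean;
    }
    else {
        $seen{$clean} = $name;
    }
}
```

Both sample strings become `Ms__Knope`, so if the results are used as file names or keys, a check along these lines may be worth doing before relying on them being unique.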
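Edit
Éric's comment below is very relevant. Here is a better approach that replaces accented characters with their unaccented equivalents before doing the substitutions: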
use utf8;
use Unicode::Normalize;
my $input = "La femme d'à côté";
my $result = NFD($input); # Unicode normalization Form D (NFD), canonical decomposition.
$result =~ s/[^[:ascii:]]//g; # Remove all non-ASCII characters (the combining marks left by NFD).
$result =~ s/ - /_/g; # Replace all " - " with "_"
$result =~ s/[^A-Za-z0-9]/_/g; # Replace all non-alphanumericals with _
print $result;
This variant outputs:
La_femme_d_a_cote