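What am I doing? The script loads a string from a .txt file (locations.txt) and splits it into 6 variables. The values are separated by commas. I then visit a website whose address depends on these 6 values.
What is the problem? If one of the values in the string in locations.txt contains a space, the script does not build the correct URL.
The input file is: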
locations.txt = Heinz,Weber,Sierra Leone,1915,M,White
Because "Sierra Leone" contains a space, the URL should be:
https://familysearch.org/search/collection/results#count=20&query=%2Bgivenname%3AHeinz%20%2Bsurname%3AWeber%20%2Bbirth_place%3A%22Sierra%20Leone%22%20%2Bbirth_year%3A1914-1918~%20%2Bgender%3AM%20%2Brace%3AWhite&collection_id=2000219
But the code below does not handle this correctly.
I am using these packages:
use strict;
use warnings;
use WWW::Mechanize::Firefox;
use HTML::TableExtract;
use Data::Dumper;
use LWP::UserAgent;
use JSON;
use CGI qw/escape/;
use HTML::DOM;
This is the start of the code:
open(my $l, 'locations26.txt') or die "Can't open locations: $!";
open(my $o, '>', 'out2.txt') or die "Can't open output file: $!";
while (my $line = <$l>) {
    chomp $line;
    my %args;
    @args{qw/givenname surname birth_place birth_year gender race/} = split /,/, $line;
    $args{birth_year} = ($args{birth_year} - 2) . '-' . ($args{birth_year} + 2);
    my $mech = WWW::Mechanize::Firefox->new(create => 1, activate => 1);
    $mech->get("https://familysearch.org/search/collection/results#count=20&query=%2Bgivenname%3A".$args{givenname}."%20%2Bsurname%3A".$args{surname}."%20%2Bbirth_place%3A".$args{birth_place}."%20%2Bbirth_year%3A".$args{birth_year}."~%20%2Bgender%3AM%20%2Brace%3AWhite&collection_id=2000219");
    # REST OF THE SCRIPT HERE. MANY LINES.
}
As another example, the following works:
locations.txt = Benjamin,Schuvlein,Germany,1913,M,White
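One possible way to handle the space is to build the query as plain text first and percent-encode it in one step, instead of pasting raw values between pre-encoded fragments. The sketch below makes several assumptions. The helper names percent_encode and build_search_url are hypothetical. The rule that a value containing whitespace must be wrapped in double quotes is only inferred from the expected URL above. Gender and race are taken from the input line rather than hard-coded. It prints the URL instead of calling $mech->get, so it can be tried without a browser.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode every character except the
# RFC 3986 unreserved set (letters, digits, "-", ".", "_", "~").
sub percent_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;
    return $text;
}

# Hypothetical helper: turn one comma-separated input line into a results URL.
sub build_search_url {
    my ($line) = @_;
    my @fields = qw/givenname surname birth_place birth_year gender race/;
    my %args;
    @args{@fields} = split /,/, $line;

    # Same +/- 2 year window as the original script, with the trailing "~".
    $args{birth_year} = ($args{birth_year} - 2) . '-' . ($args{birth_year} + 2) . '~';

    my @terms;
    for my $field (@fields) {
        my $value = $args{$field};
        # Assumption: multi-word values need surrounding double quotes.
        $value = qq{"$value"} if $value =~ /\s/;
        push @terms, "+$field:$value";
    }

    # Encode the whole query once, so "+", ":", quotes and spaces are all escaped.
    return 'https://familysearch.org/search/collection/results#count=20&query='
         . percent_encode(join ' ', @terms)
         . '&collection_id=2000219';
}

print build_search_url('Heinz,Weber,Sierra Leone,1915,M,White'), "\n";
```

Inside the loop, the result of build_search_url($line) could then be passed to $mech->get. If the URI distribution is installed, uri_escape from URI::Escape performs the same encoding as the hand-written helper and may be preferable.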