I have a space-delimited file like this:
First Second Third    Forth
It        is possible to
do      this          task
with   regex but      i
don't   know how      to
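My task is to capture all the words on each line and build a hash from them.
But here is my problem: a field in any column may be empty (for example, line 3, field 3).
The words on each line are aligned with their column name, either at its beginning or at its end. (The column names are the words on the first line, e.g. First Second Third Forth.)
In my example, the words in the First, Third and Forth columns are aligned to the left (to the beginning of the column name), and the words in the Second column are aligned to the right (to the end of the column name).
Using the hash for each line, I have to produce output in the following format: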
$hash{First} has Second-property $hash{Second}. It also has $hash{Third} and $hash{Forth}.
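To make the target format concrete, here is a minimal sketch of how one row hash could be rendered. The helper name format_row is hypothetical (it does not appear in my code below), and showing an empty field as an empty string is an assumption about the desired behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): renders one row
# hash in the required output format. Assumption: a missing (undef) field
# is printed as an empty string.
sub format_row {
    my ($row) = @_;
    my %h = map { $_ => $row->{$_} // '' } qw(First Second Third Forth);
    return "$h{First} has Second-property $h{Second}. "
         . "It also has $h{Third} and $h{Forth}.";
}

# Example row taken from the first data line of the sample file.
my %hash = (First => 'It', Second => 'is', Third => 'possible', Forth => 'to');
print format_row(\%hash), "\n";
# prints: It has Second-property is. It also has possible and to.
```

The `//` (defined-or) operator needs Perl 5.10 or later.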
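use strict;
use warnings;
use File::Basename;
use locale;

# The input file is the first argument; output is appended to "<basename>2.txt".
open my $file, "<", $ARGV[0] or die "Cannot open $ARGV[0]: $!";
open my $file2, ">>", fileparse($ARGV[0]) . "2.txt" or die "Cannot open output file: $!";
my @alls = <$file>;

sub Main {
    my $first = shift @alls;               # header line with the column names
    my $poses = First_And_Last($first);    # [begin, end] offsets of each column name
    my $curr_poses;
    my $curr_hash;
    #do{OutputLine($_->[0],$_->[1],$first)}for (@$poses);
    my $result_array = [];
    # Hash keys; these must match the column names in the header of the input file.
    my @keys = ('#', qw(Variable Type Len Format Informat Label));
    for my $word (@alls) {
        $curr_poses = First_And_Last($word);    # offsets of the words on this line
        undef($curr_hash);
        $curr_hash = Take_Words($poses, $word, $curr_poses);
        push @{$result_array}, $curr_hash;      # AoH
    }
    return $result_array;
    #end of main
}

Main();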
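sub First_And_Last {
    #First_And_Last($str)
    # Returns a reference to an array of [begin, end] offsets, one pair per match.
    my $str = shift;
    my $begin;
    my $end;
    my $ref = [];
    # A match is a run of non-space characters, each optionally followed by one
    # whitespace character, ending at a word boundary; or a literal "#".
    while ($str =~ m/(([\S\.]\s?)+\b|#)/g) {
        $begin = pos($str) - length($1);
        $end   = pos($str);
        push @{$ref}, [$begin, $end];
    }
    return $ref;
}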
sub Take_Words {
    #Take_Words($poses, $line,$current)
    my $outref  = {};
    my $ref     = shift;    # ref to the offsets of the header words
    my $line    = shift;    # the next line in the file
    my $current = shift;    # offsets of the words on the current line
    my @keys = ('#', qw(Variable Type Len Format Informat Label));
    $outref->{$_} = undef for @keys;    # every key exists, empty fields stay undef
    my $ethalon;       # current header offsets (from $ref)
    my $relativity;    # current word offsets (from $current)
    my $key;           # current key in $outref
    my @ethalon = @{$ref};
    $ethalon    = shift @ethalon;
    $relativity = shift @{$current};
    $key        = shift @keys;
    while (defined($key) && defined($ethalon) && defined($relativity)) {
        # The word belongs to this column if it starts or ends where the header does.
        if ($ethalon->[0] == $relativity->[0] || $ethalon->[1] == $relativity->[1]) {
            $outref->{$key} = substr($line, $relativity->[0], $relativity->[1] - $relativity->[0]);
            $relativity = shift @{$current};
        }
        $ethalon = shift @ethalon;
        $key     = shift @keys;
    }
    return $outref;
}
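
For comparison, here is a stripped-down sketch of the same start-or-end matching idea, with the keys taken from the header line instead of a fixed list. The names word_spans and parse_row are hypothetical, the exact column spacing in the sample lines is an assumption, and the sketch only handles fields that consist of a single word.

```perl
use strict;
use warnings;

# Hypothetical helper: returns [word, start, end] for every
# whitespace-separated word in $line (end is the offset just past the word).
sub word_spans {
    my ($line) = @_;
    my @spans;
    while ($line =~ /(\S+)/g) {
        push @spans, [ $1, $-[1], $+[1] ];
    }
    return @spans;
}

# Hypothetical helper: builds a hash for one data line. A word is assigned
# to the column whose header starts at the same offset (left-aligned) or
# ends at the same offset (right-aligned). Unmatched columns stay undef.
sub parse_row {
    my ($header_spans, $line) = @_;
    my %row = map { $_->[0] => undef } @$header_spans;
    for my $word (word_spans($line)) {
        for my $col (@$header_spans) {
            if ($word->[1] == $col->[1] || $word->[2] == $col->[2]) {
                $row{ $col->[0] } = $word->[0];
                last;
            }
        }
    }
    return \%row;
}

# Sample data; the spacing is assumed, chosen to satisfy the alignment rules.
my @lines = (
    'First Second Third    Forth',
    'It        is possible to',
    'do      this          task',
);
my @header = word_spans(shift @lines);
for my $line (@lines) {
    my $row = parse_row(\@header, $line);
    print join(', ', map { "$_=" . ($row->{$_} // '(empty)') }
                     map { $_->[0] } @header), "\n";
}
# prints:
# First=It, Second=is, Third=possible, Forth=to
# First=do, Second=this, Third=(empty), Forth=task
```

The offsets come from the built-in @- and @+ arrays, which hold the start and end positions of the last successful match, so no pos() arithmetic is needed.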