4

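I have a file that contains a mapping between words. I need to look up that file and replace those words with their mapped values in some other files. For example, the file below holds the mapping table: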

1.12.2.4               1
1.12.2.7               12
1.12.2.2               5
1.12.2.4               4
1.12.2.6               67
1.12.2.12              5

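I will have many files that contain these keywords (1.12.2.*). I want to search for these keywords and replace each one with the corresponding value from this mapping file. How can I do this in the shell? Suppose a file contains the following lines: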

The Id of the customer is 1.12.2.12. He is from Grg. 
The Name of the machine is ASB
The id is 1.12.2.4. He is from Psg.

After the script runs, the numbers "1.12.2.12" and "1.12.2.4" should be replaced with 5 and 4 (as given in the master file). Can anyone help me?

3 Answers

7

One way, using GNU awk:

awk 'FNR==NR { array[$1]=$2; next } { for (i in array) gsub(i, array[i]) }1' master.txt file.txt

Results:

The Id of the customer is 5. He is from Grg.
The Name of the machine is ASB
The id is 4. He is from Psg.

To save the output to a file:

awk 'FNR==NR { array[$1]=$2; next } { for (i in array) gsub(i, array[i]) }1' master.txt file.txt > name_of_your_output_file.txt
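
Since the question mentions many input files, the same command can be run in a loop. A minimal sketch, assuming the inputs match a hypothetical glob input_*.txt and that each result goes to a file with a hypothetical .mapped suffix (neither name comes from the question); the setup lines only create small demo files in a scratch directory:

```shell
#!/bin/sh
# Demo setup in a scratch directory; all file names here are illustrative.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '1.12.2.4 4' '1.12.2.12 5' > master.txt
printf '%s\n' 'The Id of the customer is 1.12.2.12.' > input_a.txt
printf '%s\n' 'The id is 1.12.2.4.' > input_b.txt

# Apply the mapping to every matching file, writing one output per input.
for f in input_*.txt; do
    awk 'FNR==NR { array[$1]=$2; next } { for (i in array) gsub(i, array[i]) }1' \
        master.txt "$f" > "$f.mapped"
done

cat input_a.txt.mapped input_b.txt.mapped
```

Writing to a separate file and only then replacing the original avoids truncating an input while awk is still reading it.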

Explanation:

FNR==NR { ... }   # FNR is the record number within the current file, NR is the
                  # total record number across all files, so FNR==NR means:
                  # "while we process the first file listed";
                  # in this case that is "master.txt"
array[$1]=$2      # store column 1 as an array key with column 2 as its value
next              # skip to the next record

{                 # this could be written as: FNR!=NR { ... }
                  # so this means "while we process the second file listed..."
for (i in array)  # means "for every key in the array..."
gsub(i, array[i]) # perform a global substitution on each line, replacing the key
                  # (treated as a regular expression) with its value if found
}1                # the trailing 1 is shorthand for '{ print }'
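
Because array[$1]=$2 overwrites any earlier value, a key that is listed twice (such as 1.12.2.4 in the sample mapping) ends up with its last value, which is why the result above shows 4 rather than 1. A small sketch for listing such duplicates before relying on the mapping; the scratch file only recreates the sample data, and the report wording is my own:

```shell
#!/bin/sh
# Recreate the sample mapping in a scratch file (the path is illustrative).
master=$(mktemp)
cat > "$master" << 'EOF'
1.12.2.4               1
1.12.2.7               12
1.12.2.2               5
1.12.2.4               4
1.12.2.6               67
1.12.2.12              5
EOF

# For every key seen before, report the earlier value and the value that
# replaces it in a last-one-wins lookup; then remember the current value.
duplicates=$(awk '($1 in seen) { print $1 ": " seen[$1] " is overridden by " $2 }
                  { seen[$1] = $2 }' "$master")
printf '%s\n' "$duplicates"
rm -f "$master"
```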

Adding word boundaries makes the match stricter:

awk 'FNR==NR { array[$1]=$2; next } { for (i in array) gsub("\\<"i"\\>", array[i]) }1' master.txt file.txt
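
Even with word boundaries, each key is still used as a regular expression, so a dot in a key matches any character. One alternative is to pull each dotted number out of the line and look it up in the array as a plain string. A sketch, assuming every key consists of digits separated by dots (as in the sample) and using only POSIX awk features; the last input line is an extra test case that is not in the question:

```shell
#!/bin/sh
# Recreate the sample files in a scratch directory (paths are illustrative).
workdir=$(mktemp -d)
cat > "$workdir/master.txt" << 'EOF'
1.12.2.4               1
1.12.2.7               12
1.12.2.2               5
1.12.2.4               4
1.12.2.6               67
1.12.2.12              5
EOF
cat > "$workdir/file.txt" << 'EOF'
The Id of the customer is 1.12.2.12. He is from Grg.
The Name of the machine is ASB
The id is 1.12.2.4. He is from Psg.
Build 1x12y2z4 is not an id
EOF

result=$(awk '
# First file: remember each key and its replacement (last duplicate wins).
FNR == NR { map[$1] = $2; next }
# Other files: copy the line, swapping every dotted number found in the map.
{
    rest = $0
    out = ""
    while (match(rest, /[0-9]+(\.[0-9]+)+/)) {
        token = substr(rest, RSTART, RLENGTH)
        out = out substr(rest, 1, RSTART - 1) ((token in map) ? map[token] : token)
        rest = substr(rest, RSTART + RLENGTH)
    }
    print out rest
}' "$workdir/master.txt" "$workdir/file.txt")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Each line is scanned once regardless of how many keys the mapping holds, and a string such as 1x12y2z4, which the regular expression 1.12.2.4 would match, is left alone.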
answered 2012-09-13T07:17:25.843
6

You could have sed write a sed script for you:

Mappings:

cat << EOF > mappings
1.12.2.4               1
1.12.2.7               12
1.12.2.2               5
1.12.2.4               4
1.12.2.6               67
1.12.2.12              5
EOF

Input file:

cat << EOF > infile
The Id of the customer is 1.12.2.12. He is from Grg. 
The Name of the machine is ASB
The id is 1.12.2.4. He is from Psg.
EOF

Generate a script from the mappings (GNU sed):

sed -r -e 's:([^ ]*) +(.*):s/\\b\1\\b/\2/g:' mappings

Output:

s/\b1.12.2.4\b/1/g
s/\b1.12.2.7\b/12/g
s/\b1.12.2.2\b/5/g
s/\b1.12.2.4\b/4/g
s/\b1.12.2.6\b/67/g
s/\b1.12.2.12\b/5/g

Evaluate it with another sed (GNU sed):

sed -r -e 's:([^ ]*) +(.*):s/\\b\1\\b/\2/g:' mappings | sed -f - infile

Output:

The Id of the customer is 5. He is from Grg. 
The Name of the machine is ASB
The id is 1. He is from Psg.

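Note that the mappings are treated as regular expressions, e.g. a dot (.) matches any character, so they may need to be escaped either in the mappings file or when generating the sed script.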
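
One way to do that escaping while generating the script is sketched below, assuming GNU sed (for \b and for \n inside a regular expression), keys that contain no regex metacharacters other than dots, and mapping lines without trailing whitespace. The hold space keeps the original line while the key is rewritten; the demo input line is my own, not from the question:

```shell
#!/bin/sh
# Recreate part of the sample mapping (the path is illustrative).
workdir=$(mktemp -d)
cat > "$workdir/mappings" << 'EOF'
1.12.2.4               1
1.12.2.12              5
EOF

# Per mapping line: save the line (h), keep only the key, rewrite each dot
# as [.], append the saved line (G), drop the unescaped key so that only
# "escaped-key value" remains, then wrap the pair in an s/// command.
script=$(sed -E \
    -e 'h' \
    -e 's/ .*//' \
    -e 's/\./[.]/g' \
    -e 'G' \
    -e 's:\n[^ ]* +: :' \
    -e 's:([^ ]*) (.*):s/\\b\1\\b/\2/g:' \
    "$workdir/mappings")
printf '%s\n' "$script"

# Apply the generated script; 1x12y2z4 no longer matches the first key.
result=$(printf '%s\n' 'The id is 1.12.2.4. Code 1x12y2z4 stays.' | sed -e "$script")
printf '%s\n' "$result"
rm -rf "$workdir"
```

printf is used instead of echo because some shells' echo would interpret the \b in the generated script as a backspace.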

answered 2012-09-13T09:24:46.940
0

Since you haven't provided any examples, I guess this is what you want:

Input file

> cat temp
1.12.2.4  1
1.12.2.7  12
1.12.2.2  5
1.12.2.4  4
1.12.2.6  67
1.12.2.12  5

File to be replaced

> cat temp2
The Id of the customer is 1.12.2.12. He is from Grg. 
The Name of the machine is ASB
The id is 1.12.2.4. He is from Psg.

Output

> temp.pl
The Id of the customer is 5. He is from Grg. 
The Name of the machine is ASB
The id is 4. He is from Psg.

>

Below is the Perl script.

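#!/usr/bin/perl

use strict;
use warnings;

my %hsh;

# Read the mapping file: column 1 is the key, column 2 is the replacement.
open(my $map_fh, '<', 'temp') or die "Cannot open temp: $!";
while (<$map_fh>) {
    my @arr = split ' ';
    next unless @arr >= 2;      # skip blank or malformed lines
    $hsh{$arr[0]} = $arr[1];    # a repeated key keeps its last value
}
close($map_fh);

# Apply every mapping to each line, then print the line exactly once.
open(my $in_fh, '<', 'temp2') or die "Cannot open temp2: $!";
while (my $line = <$in_fh>) {
    foreach my $key (keys %hsh) {
        # \Q...\E matches the key literally, so "." is not a wildcard
        $line =~ s/\Q$key\E/$hsh{$key}/g;
    }
    print $line;
}
close($in_fh);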
answered 2012-09-13T07:15:03.707