0

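I'd like to know how to accomplish the following. For example, I have a file with these contents:

monkey
donkey
chicken
horse

I want to grep it, so I run grep "horse\|donkey\|chicken", which gives me:

donkey
chicken
horse

However, what I actually want is:

horse
donkey
chicken

So I want the output in the order of the alternatives in my "regular expression". I checked the man page but couldn't find any option for this. Is this possible (using grep)?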

4

5 Answers

2

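Try this solution using perl. It can fail in many ways and has severe limitations, such as no more than 9 alternatives, or no | inside the subexpressions. This is because the script wraps each word in parentheses and looks for matches in $1, $2, etc.

Contents of script.pl: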

#!/usr/bin/env perl

use warnings;
use strict;

my (%matches, %words);

die qq|Usage: perl $0 <input-file> <regular-expression-PCRE>\n| unless @ARGV == 2;

my $re = pop;

## Assign an ordered number for each subexpression.
do {
    my $i = 0;
    %words = map { ++$i => $_ } split /\|/, $re;
};

## Surround each subexpression between parentheses to be able to select them
## later with $1, $2, etc.
$re =~ s/^/(/;
$re =~ s/$/)/;
$re =~ s/\|/)|(/g;

$re = qr/$re/;

## Process each line of the input file.
while ( <> ) { 
    chomp;

    ## If it matches any of the alternatives, search for it in any of the
    ## grouped expressions (limited to 9).
    if ( m/$re/o ) { 
        for my $i ( 1 .. 9 ) { 
            if ( eval '$' . $i ) { 
                $matches{ $i }++;
            }   
        }   
    }   
}

## Print them sorted.
for my $key ( sort keys %matches ) { 
    printf qq|%s\n|, $words{ $key } for ( 1 .. $matches{ $key } );
}

Assuming infile contains:

monkey
donkey
chicken
horse
dog
cat
chicken
horse

Run it like this:

perl script.pl infile 'horse|donkey|chicken'

This produces:

horse
horse
donkey
chicken
chicken
answered 2013-02-20T17:43:27.347
2

grep gives you the matches in the order they appear in the input. The order of the subexpressions in your regex has nothing to do with it. If you really want the answers in that order, you could grep the file three times:

for f in myfile
do
  grep horse "$f"
  grep donkey "$f"
  grep chicken "$f"
done
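
The three grep calls above generalize to a loop over the patterns. The following is only a minimal sketch: ordered_grep is a hypothetical helper name, and the sample file is created just for the demonstration. It scans the file once per pattern, so it is best suited to small inputs.

```shell
#!/bin/sh
# ordered_grep FILE PATTERN...   (hypothetical helper, not a standard tool)
# Prints matching lines grouped by pattern, in the order the patterns are given.
ordered_grep() {
    file=$1
    shift
    for pattern in "$@"; do
        # -e keeps patterns that start with a dash from being read as options.
        grep -e "$pattern" -- "$file"
    done
}

# Sample data matching the file in the question.
tmpfile=$(mktemp)
printf '%s\n' monkey donkey chicken horse > "$tmpfile"

result=$(ordered_grep "$tmpfile" horse donkey chicken)
printf '%s\n' "$result"
rm -f "$tmpfile"
```

Note that a line matching more than one pattern is printed once for each pattern it matches.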
answered 2013-02-20T17:00:47.743
1

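You could also use awk for this. The following example collects the matching patterns in the op array and outputs them in the original order in the END rule.

pattern-ordered-grep.awk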

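# split() returns the number of patterns; this avoids length(array),
# which not every awk supports.
BEGIN { n = split(patterns, p) }

{
  # Remember the most recent line that matches each pattern.
  for(i=1; i<=n; i++)
    if($0 ~ p[i])
      op[p[i]] = $0
}

END {
  # Print the remembered lines in the order the patterns were given.
  for(i=1; i<=n; i++)
    if(p[i] in op)
      print op[p[i]]
}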

Run it like this:

awk -v patterns='horse chicken donkey' -f pattern-ordered-grep.awk infile

Output:

horse
chicken
donkey

Note that if you only want to output the pattern rather than the matching line, replace the final print statement with print p[i].
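
The script above keeps only the last line that matched each pattern, because op[p[i]] is overwritten on every match. If every matching line is wanted, one option is to append to a per-pattern bucket instead. This is a sketch only: the bucket array name and the sample data are assumptions made for illustration.

```shell
#!/bin/sh
# Sample data with repeated entries (created only for this demonstration).
tmpfile=$(mktemp)
printf '%s\n' monkey donkey chicken horse dog cat chicken horse > "$tmpfile"

result=$(awk -v patterns='horse chicken donkey' '
BEGIN { n = split(patterns, p) }
{
    # Append the current line to the bucket of every pattern it matches.
    for (i = 1; i <= n; i++)
        if ($0 ~ p[i])
            bucket[i] = bucket[i] $0 "\n"
}
END {
    # Emit the buckets in the order the patterns were given.
    for (i = 1; i <= n; i++)
        printf "%s", bucket[i]
}' "$tmpfile")
printf '%s\n' "$result"
rm -f "$tmpfile"
```

The buckets hold all matching lines in memory until the END rule runs, which is fine for small files but may not suit very large ones.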

answered 2013-02-20T20:47:24.387
1

Just create an array of the strings you want and, as you find each one, move on to checking for the next element in the array:

$ cat tst.awk
BEGIN{ numStrings = split("horse donkey chicken",strings) }
$0 == strings[numFound+1] { numFound++ }
numFound == numStrings { print "Found them all!"; exit }

$ cat file2           
monkey
horse
donkey
chicken

$ awk -f tst.awk file2
Found them all!

$ cat file            
monkey
donkey
chicken
horse

$ awk -f tst.awk file
$
answered 2013-02-21T16:46:52.500
0

How about this?

cat file1.txt | grep -e horse -e donkey -e chicken | sort -r
horse
donkey
chicken
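
The sort -r pipeline gives the requested order here only because horse, donkey, chicken happens to be reverse alphabetical. For an arbitrary order, one possible approach is to tag each matching line with the index of the first pattern it matches, sort on that tag, and strip it again. The sketch below assumes sample data and a pattern order chosen purely for illustration.

```shell
#!/bin/sh
# Sample data matching the file in the question.
tmpfile=$(mktemp)
printf '%s\n' monkey donkey chicken horse > "$tmpfile"

# Decorate: prefix each matching line with "pattern index <TAB> line number".
# Sort: numerically by pattern index, then by line number to keep input order.
# Undecorate: cut away the two helper fields.
result=$(awk -v patterns='donkey horse chicken' '
BEGIN { n = split(patterns, p) }
{
    for (i = 1; i <= n; i++)
        if ($0 ~ p[i]) { print i "\t" NR "\t" $0; break }
}' "$tmpfile" | sort -k1,1n -k2,2n | cut -f3-)
printf '%s\n' "$result"
rm -f "$tmpfile"
```

Unlike sort -r, this does not depend on the patterns being in any alphabetical order, and each line is printed at most once.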
answered 2013-02-20T17:20:06.560