linux - How to grep for the content after a pattern?

Question

Given a file, for example:

potato: 1234
apple: 5678
potato: 5432
grape: 4567
banana: 5432
sushi: 56789

I want to grep for all lines that start with potato: but only pipe on the numbers that follow potato:. So in the example above, the output would be:

1234
5432

How can I do this?

score 138 · Accepted Answer

grep 'potato:' file.txt | sed 's/^.*: //'

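grep finds every line that contains the string potato:, and then, for each of those lines, sed replaces (s///) any characters (.*) from the beginning of the line (^) up to the last occurrence of the sequence ": " (a colon followed by a space) with the empty string (s/...// replaces the first part with the second part, which is empty).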
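
Two consequences of that description may be worth seeing in action: grep 'potato:' matches the string anywhere on the line, and the greedy .* strips everything up to the last ": ". The sketch below is illustrative only; the variable names and the last two sample lines are assumptions that do not appear in the question.

```shell
# Hypothetical sample data; the last two lines are NOT from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
potato: 1234
apple: 5678
potato: 5432
potato: a: b
sweet potato: 99
EOF

# Pipeline from the answer: matches "potato:" anywhere on the line and
# strips through the LAST ": " because .* is greedy.
unanchored=$(grep 'potato:' "$sample" | sed 's/^.*: //')

# Anchored variant: only lines that start with "potato: ", and only
# that prefix is removed.
anchored=$(grep '^potato: ' "$sample" | sed 's/^potato: //')

printf 'unanchored:\n%s\n' "$unanchored"
printf 'anchored:\n%s\n' "$anchored"
rm -f "$sample"
```

On the question's own data both forms give the same result; they differ only when a line contains "potato:" somewhere other than the start, or contains ": " more than once.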

or

grep 'potato:' file.txt | cut -d\   -f2

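For each line that contains potato:, cut splits the line into fields delimited by a space (-d\ where d = delimiter and \ = an escaped space character; something like -d" " would also work) and prints the second field of each such line (-f2).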

or

grep 'potato:' file.txt | awk '{print $2}'

For each line that contains potato:, awk prints the second field (print $2), where fields are delimited by whitespace by default.
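
The cut and awk variants split fields differently, which shows up only when more than one space follows the colon. A minimal sketch, assuming a made-up input line with three spaces (not part of the question's file):

```shell
# Assumed variation on the sample data: three spaces after the colon.
line='potato:   1234'

# cut treats every single space as a delimiter, so field 2 is the empty
# string between the first and second space.
cut_out=$(printf '%s\n' "$line" | cut -d' ' -f2)

# awk's default splitting collapses runs of whitespace, so field 2 is
# the number.
awk_out=$(printf '%s\n' "$line" | awk '{print $2}')

printf 'cut: [%s]\nawk: [%s]\n' "$cut_out" "$awk_out"
```

With exactly one space after the colon, as in the question, both commands print the same value.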

or

grep 'potato:' file.txt | perl -e 'for(<>){s/^.*: //;print}'

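All lines that contain potato: are sent to an inline (-e) Perl script, which takes all lines from stdin and then, for each of those lines, performs the same substitution as in the first example above and prints the result.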

or

awk '{if(/potato:/) print $2}' < file.txt

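The file is sent via stdin (< file.txt sends the contents of the file to the stdin of the command on its left) to an awk script that, for each line containing potato: (if(/potato:/) is true when the regular expression /potato:/ matches the current line), prints the second field, as described above.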
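
The explicit if inside the action can also be written as an awk pattern in front of the action. The sketch below compares the two forms on a temporary copy of part of the question's data; the ^ anchor in the second form is an added assumption that restricts the match to lines starting with potato:, which is what the question asks for.

```shell
# Temporary sample file holding part of the question's data.
sample=$(mktemp)
cat > "$sample" <<'EOF'
potato: 1234
apple: 5678
potato: 5432
grape: 4567
EOF

# Form used in the answer: test the regex inside the action block.
if_form=$(awk '{if(/potato:/) print $2}' < "$sample")

# Pattern-action form: the action runs only for lines matching the
# pattern; ^ anchors the match to the start of the line.
pattern_form=$(awk '/^potato:/ {print $2}' "$sample")

printf '%s\n' "$if_form"
printf '%s\n' "$pattern_form"
rm -f "$sample"
```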

or

perl -e 'for(<>){/potato:/ && s/^.*: // && print}' < file.txt

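The file is sent via stdin (< file.txt, see above) to a Perl script similar to the one above, but this time it also makes sure that each line contains the string potato: (/potato:/ is a regular expression that matches if the current line contains potato:, and if it does (&&), the script goes on to apply the substitution described above and print the result).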

score 72 · Accepted Answer

Or use a regex lookbehind assertion: grep -oP '(?<=potato: ).*' file.txt

answered 2012-04-27T22:59:07.293

score 13 · Accepted Answer

sed -n 's/^potato:[[:space:]]*//p' file.txt

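Grep can be thought of as a restricted Sed, or Sed as a generalized Grep. In this case, Sed is a good lightweight tool that does what you want, though of course there are several other reasonable ways to do it.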

score 13 · Accepted Answer

grep -Po 'potato:\s\K.*' file

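-P uses Perl regular expressions

-o outputs only the matched part

\s matches the whitespace character after potato:

\K discards everything matched so far from the reported match

.* matches the rest of the line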
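
Not every grep is built with Perl regex support, so -P can fail on some systems. The following is a cautious sketch that probes for -P and otherwise falls back to a sed command similar to the one in the previous answer; the probe itself, the ^ anchor, and the variable names are assumptions added for illustration.

```shell
# Temporary sample file holding part of the question's data.
sample=$(mktemp)
printf '%s\n' 'potato: 1234' 'apple: 5678' 'potato: 5432' > "$sample"

# Probe: does this grep accept -P? Errors are silenced on purpose.
if printf 'x\n' | grep -qP 'x' 2>/dev/null; then
    # \K drops "potato:" and the whitespace character from the output.
    values=$(grep -oP '^potato:\s\K.*' "$sample")
else
    # Fallback without Perl regexes: remove the prefix, print on success.
    values=$(sed -n 's/^potato:[[:space:]]//p' "$sample")
fi

printf '%s\n' "$values"
rm -f "$sample"
```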

score 2 · Accepted Answer

This will print everything after each match, on that same line only:

perl -lne 'print $1 if /^potato:\s*(.*)/' file.txt

This will do the same, except that it will also print all subsequent lines:

perl -lne 'if ($found){print} elsif (/^potato:\s*(.*)/){print $1; $found++}' file.txt

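These command-line options are used: -n loops over each line of the input, -l strips the trailing newline from each line before processing and adds it back when printing, and -e executes the given Perl code.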

score 1 · Accepted Answer

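As other answers have stated, you can use grep. But you don't need grep, awk, sed, perl, cut, or any external tool. You can do this with pure bash.

Try this (the semicolons allow you to put it all on one line):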

$ while read line;
  do
    if [[ "${line%%:\ *}" == "potato" ]];
    then
      echo ${line##*:\ };
    fi;
  done< file.txt

## tells bash to delete the longest match of ": " in $line from the front.

$ while read line; do echo ${line##*:\ }; done< file.txt
1234
5678
5432
4567
5432
56789

Or, if you want the key instead of the value, %% tells bash to delete the longest match of ": " in $line from the end.

$ while read line; do echo ${line%%:\ *}; done< file.txt
potato
apple
potato
grape
banana
sushi

The substring to split on is ":\ " because the space character must be escaped with a backslash.

You can find more like this at the Linux Documentation Project.
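
The difference between the longest-match operators (## and %%) and their shortest-match counterparts (# and %) only shows when the separator occurs more than once. A small sketch, assuming a made-up line that contains ": " twice (it is not in the question's file):

```shell
# Made-up value containing the separator ": " twice.
line='key: a: b'

shortest_front=${line#*:\ }    # removes "key: "     leaving "a: b"
longest_front=${line##*:\ }    # removes "key: a: "  leaving "b"
shortest_back=${line%:\ *}     # removes ": b"       leaving "key: a"
longest_back=${line%%:\ *}     # removes ": a: b"    leaving "key"

printf '%s\n' "$shortest_front" "$longest_front" "$shortest_back" "$longest_back"
```

For the question's data, where each line holds a single ": ", the short and long forms give the same result.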

score 1 · Accepted Answer

Modern BASH supports regular expressions:

while read -r line; do
  if [[ $line =~ ^potato:\ ([0-9]+) ]]; then
    echo "${BASH_REMATCH[1]}"
  fi
done
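
The loop above reads from standard input, so it needs something piped or redirected into it. One possible way to package it, assuming bash (the [[ =~ ]] operator and BASH_REMATCH are not available in plain sh) and a hypothetical function name potato_numbers:

```shell
# Requires bash. potato_numbers is a hypothetical helper name.
potato_numbers() {
    local line
    # IFS= and -r keep leading whitespace and backslashes intact.
    while IFS= read -r line; do
        # The first capture group holds the digits after "potato: ".
        if [[ $line =~ ^potato:\ ([0-9]+) ]]; then
            printf '%s\n' "${BASH_REMATCH[1]}"
        fi
    done
}

# Usage: feed lines on stdin (a file redirect works the same way).
result=$(printf '%s\n' 'potato: 1234' 'apple: 5678' 'potato: 5432' | potato_numbers)
printf '%s\n' "$result"
```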

score -1 · Accepted Answer

grep potato file | grep -o "[0-9].*"

answered 2021-10-28T13:32:54.827
