unix - Filter rows based on text in a column

Question

I have a tab-delimited text file that looks like this:

27  1   hom het:het    het,het,het,het
18  1   hom het:het    hom,het,het,het,het,het,het
29  1   hom het:het    hom,hom,hom,hom,hom,hom,hom,hom,hom,hom,hom,hom,hom,hom
13  1   hom het:het    het,het,het,het,het,het
21  1   hom het:het    hom,het,het,het,het,het,hom,het,hom,het,het,het,hom
25  1   hom het:het    het,hom,het,het,het
29  1   hom het:het    hom,hom,het,hom,het,het,hom,het,het,hom,het,hom,het,hom
18  1   hom het:het    het,het,het
19  1   hom het:het    het,het,hom,het,het,het,het,het,het,hom,het,het,hom,het

I want to exclude the rows that have "hom" in column 5, i.e. the output should look like this:

27  1   hom het:het    het,het,het,het
13  1   hom het:het    het,het,het,het,het,het
18  1   hom het:het    het,het,het

Is there a way to do this with unix commands?

score 5 · Accepted Answer

awk is a good fit for this:

$ awk '$5!~/\<hom\>/' file
27  1   hom het:het    het,het,het,het
13  1   hom het:het    het,het,het,het,het,het
18  1   hom het:het    het,het,het

Explanation:

$5         # is the fifth column
!~         # negated regex match 
/          # start regex string
\<         # matches the empty string at the beginning of a word.
hom        # matches the literal string 'hom'
\>         # matches the empty string at the end of a word.
/          # end regex string
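Note that the \< and \> word-boundary operators are a GNU awk extension, so the one-liner above may not behave the same in other awk implementations. A minimal sketch of a more portable variant, assuming column 5 is always a comma-separated list: split the field on commas and skip the row if any item is exactly "hom". The function name filter_no_hom and the file sample.tsv are made up for illustration.

```shell
# Small sample in the same shape as the question's data (tab-separated, 5 columns).
printf '27\t1\thom\thet:het\thet,het,het,het\n' >  sample.tsv
printf '18\t1\thom\thet:het\thom,het,het\n'     >> sample.tsv
printf '18\t1\thom\thet:het\thet,het,het\n'     >> sample.tsv

# Hypothetical helper: print only rows whose 5th column has no "hom" item.
filter_no_hom() {
    awk '{
        n = split($5, items, ",")      # break column 5 into its comma-separated items
        for (i = 1; i <= n; i++)
            if (items[i] == "hom")     # exact comparison, so no regex extensions are needed
                next                   # drop this row and move on to the next one
        print
    }' "$1"
}

filter_no_hom sample.tsv
```

Because the comparison is an exact string match on each item, a value such as "homX" would not cause a row to be dropped, which is the same intent as the word-boundary regex.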

score 0 · Accepted Answer

Here is an attempt using sed:

sed -r '/^(\S+\s+){4}\S*\bhom\b/d' file

Output:

27  1   hom het:het    het,het,het,het
13  1   hom het:het    het,het,het,het,het,het
18  1   hom het:het    het,het,het
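The \S, \s and \b escapes used with sed here are GNU extensions. As a rough sketch of the same idea using only POSIX character classes, grep -Ev can drop the matching rows instead. This assumes column 5 is the last column, is comma-separated, and has no trailing whitespace; the function name drop_hom_rows and the file sample.tsv are made up for illustration.

```shell
# Small sample in the same shape as the question's data (tab-separated, 5 columns).
printf '27\t1\thom\thet:het\thet,het,het,het\n'     >  sample.tsv
printf '18\t1\thom\thet:het\thom,het,het\n'         >> sample.tsv
printf '25\t1\thom\thet:het\thet,hom,het,het,het\n' >> sample.tsv
printf '18\t1\thom\thet:het\thet,het,het\n'         >> sample.tsv

# Hypothetical helper: -v inverts the match, so rows matching the pattern are dropped.
drop_hom_rows() {
    # ^(field + whitespace){4}   skips the first four columns
    # ([^[:space:]]*,)?          optionally skips earlier items in column 5
    # hom(,|$)                   requires "hom" to be a whole item
    grep -Ev '^([^[:space:]]+[[:space:]]+){4}([^[:space:]]*,)?hom(,|$)' "$1"
}

drop_hom_rows sample.tsv
```

Anchoring at the start of the line matters here, since column 3 contains "hom" on every row and must not trigger the filter.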
