unix - Extract a line if a specified column contains a word

Question

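I want to extract a line if it contains a word in a specified column of a text file. How can I do this with a one-line unix command? Maybe with cat, echo, cut, grep and a few pipes or something.

I have a text file that uses this format: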

#SentenceID<tab>Sentence1<tab>Sentence2<tab>Other_unknown_number_of_columns<tab> ...

A sample of the text file looks like this:

021348  this is the english sentence with coach .   c'est la phrase française avec l'entraîneur .   And then there are several nonsense columns like these  .
923458  this is a another english sentence without the word .   c'est une phrase d'une autre anglais sans le bus mot .  whatever foo bar    nonsense columns    2134234 $%^&

If the word I am looking for, coach, is in the second column, the command should output:

021348  this is the english sentence with coach .   c'est la phrase française avec l'entraîneur .   And then there are several nonsense columns like these  .

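I can do this in Python, but I am looking for a unix command or a one-liner:

# Write lines whose second tab-separated column contains the word "coach".
with open('in.txt') as infile, open('out.txt', 'w') as outfile:
    for line in infile:
        columns = line.rstrip('\n').split('\t')
        if len(columns) > 1 and 'coach' in columns[1].split():
            outfile.write(line)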

score 5 · Accepted Answer

How about this?

awk -F'\t' '{if($2 ~ "coach") print}' your_file

-F'\t' --> sets the field separator to a tab.
$2 ~ "coach" --> looks for "coach" in the second field.
print $0 or print --> prints the whole line.
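
To try the command without touching real data, here is a minimal sketch. The temporary file and its two shortened rows are made up for illustration; the second row deliberately has "coach" only in the third column, to show that the match is restricted to field 2.

```shell
# Build a small tab-separated sample in a temporary file (illustrative data).
sample=$(mktemp)
printf '%s\t%s\t%s\n' \
  '021348' 'this is the english sentence with coach .' 'phrase avec entraineur .' \
  '923458' 'this is a another english sentence without the word .' 'coach appears only in column 3' \
  > "$sample"

# Print every line whose second tab-separated field matches "coach".
result=$(awk -F'\t' '$2 ~ /coach/' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Only the row starting with 021348 should be printed, because the other row has the word outside the second field.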

Edit

sudo_O suggested the following, which is even shorter:

awk -F'\t' '$2~/coach/' file

score 1 · Accepted Answer

For this kind of need, I always use awk:

awk -F'\t' '$2 ~ /coach/ {print $0;}' < textFile

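You can access any column with $x, and $0 holds the whole line. The test is done with a regular expression, which is very simple in this case, so awk remains powerful if your needs become more complex.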
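
One case where the needs do get more complex: the regular expression /coach/ is a substring match, so it would also select a sentence containing "coaches". If only the exact word should match, one possible approach (a sketch, not the only way) is to split the second field on spaces and compare each token, much like the Python split() in the question. The sample rows and the variable name word are assumptions made for illustration.

```shell
# Three illustrative rows: exact word, longer word containing it, no match.
sample=$(mktemp)
printf '%s\t%s\n' \
  '1' 'the coach is here .' \
  '2' 'two coaches are here .' \
  '3' 'no match here .' \
  > "$sample"

# Substring match on field 2: also selects the "coaches" row.
substring_ids=$(awk -F'\t' '$2 ~ /coach/ {print $1}' "$sample" | paste -s -d ' ' -)

# Whole-word match: split field 2 on whitespace and compare tokens exactly.
word_ids=$(awk -F'\t' -v word='coach' '{
  n = split($2, tokens, " ")
  for (i = 1; i <= n; i++) if (tokens[i] == word) { print $1; next }
}' "$sample" | paste -s -d ' ' -)

echo "substring match IDs: $substring_ids"
echo "whole-word match IDs: $word_ids"

rm -f "$sample"
```

Passing the word with -v keeps the awk program itself unchanged when the search term changes.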
