linux - Parsing columns with awk

Question

I am new to AWK programming, and I would like to know how to filter the following text:

Goedel - Declarative language for AI, based on many-sorted logic.  Strongly
typed, polymorphic, declarative, with a module system.  Supports bignums
and sets.  "The Goedel Programming Language", P. M. Hill et al, MIT Press
1994, ISBN 0-262-08229-2.  Goedel 1.4 - partial implementation in SICStus
Prolog 2.1.
ftp://ftp.cs.bris.ac.uk/goedel
info: goedel@compsci.bristol.ac.uk

so that it prints only this:

Goedel

I used the following command, but it does not work the way I want:

awk -F " - " "/ - /{ print $1 }"

It prints the following:

Goedel
1994, ISBN 0-262-08229-2.  Goedel 1.4

Can someone tell me what I need to change to get what I want?

Thanks in advance

score 0 · Accepted Answer

awk -F ' - ' '{ if (FNR % 4 != 1) next; print $1; }'

The code above should work if the format is exactly as follows:

1 Author - ...
2 Year ...
3 URL
4 Extra info ...
5 Author - ...
6..N etc.
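As a rough sketch of the four-lines-per-entry idea, one way to select the first line of each entry is a pattern on FNR. The file name four.txt and the "Example Lang" entry are made up for illustration and are not part of the original data:

```shell
# Hypothetical input: exactly four lines per entry, no blank lines.
cat > four.txt <<'EOF'
Goedel - Declarative language for AI.
1994
ftp://ftp.cs.bris.ac.uk/goedel
info: goedel@compsci.bristol.ac.uk
Example Lang - Hypothetical second entry, for illustration only.
Year ...
URL
Extra info ...
EOF

# FNR % 4 == 1 is true on lines 1, 5, 9, ... (the "Author - ..." lines),
# so only those lines have their first field printed.
awk -F ' - ' 'FNR % 4 == 1 { print $1 }' four.txt
```

This only holds while every entry really has four lines; the sample in the question spans seven lines, so the divisor would need to match the actual layout.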

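If there is a blank line between entries, you can set RS to the empty string, and $1 will be the author as long as the value of -F (the FS variable inside an awk script) stays the same. The advantage is that you can still tell entries apart when an "info: ..." line or a URL is missing, provided the input does not look like "Author - ...{newline}Year ...{newline}{newline}info: ...{newline}{newline}Author - ...". (If a blank line is the separator between entries, there cannot be a blank line between the parts of a single entry.) For example: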

# A blank line is what separates each entry.
BEGIN { RS = ""; }

{ print $1; }
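A minimal sketch of running this from the shell, assuming a file named entries.txt whose entries are separated by blank lines. The file name and the "Example Lang" entry are hypothetical:

```shell
# Hypothetical sample file: two entries separated by one blank line.
cat > entries.txt <<'EOF'
Goedel - Declarative language for AI, based on many-sorted logic.
1994, ISBN 0-262-08229-2.  Goedel 1.4 - partial implementation.
ftp://ftp.cs.bris.ac.uk/goedel

Example Lang - Hypothetical second entry, for illustration only.
info: someone@example.org
EOF

# RS = "" puts awk in paragraph mode: each blank-line-separated block
# is one record, so $1 is the text before the first " - " in the block.
awk -F ' - ' 'BEGIN { RS = "" } { print $1 }' entries.txt
```

With this input the command should print Goedel and Example Lang, one per line. The second " - " inside the Goedel entry no longer matters, because the whole entry is a single record.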

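If your awk supports it, you can make RS a multi-character string when necessary (for example RS = "\n--\n" for entries separated by a line containing only "--"). If you need a regular expression, or your awk does not support multi-character record separators at all, you will have to use something like the following: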

BEGIN { found_sep = 1; }

{ if (found_sep) { print $1; found_sep = 0; } }

# Entry separator is "--\n"
/^--$/ { found_sep = 1; }
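To see the flag-based approach in action, here is a small sketch that feeds the same three rules a made-up file, dashed.txt, whose entries are separated by a line containing only "--". The file name and the second entry are assumptions for illustration:

```shell
# Hypothetical input: entries separated by a line that is exactly "--".
cat > dashed.txt <<'EOF'
Goedel - Declarative language for AI.
ftp://ftp.cs.bris.ac.uk/goedel
--
Example Lang - Hypothetical second entry, for illustration only.
info: someone@example.org
EOF

# found_sep starts at 1, so the first line of the file is printed.
# After that, $1 is printed only for the line that follows a "--" line.
awk -F ' - ' '
BEGIN { found_sep = 1 }
{ if (found_sep) { print $1; found_sep = 0 } }
/^--$/ { found_sep = 1 }
' dashed.txt
```

The rule order matters here: the print rule runs before the separator rule, so the "--" line itself is never printed as long as separators do not appear back to back.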

Anything more complex would require more sample input.

score 0 · Accepted Answer

awk 'BEGIN { RS = "" } { print $1 }' your_file.txt

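This means: split the input into paragraphs at blank lines, then split each paragraph into words using the default separator (whitespace), and finally print the first word ($1) of each paragraph.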
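One caveat worth sketching: with the default separator, $1 is only the first whitespace-delimited word, so a name containing a space would be cut short. The file words.txt and the "Example Lang" entry below are hypothetical:

```shell
# Hypothetical input: the second entry's name contains a space.
cat > words.txt <<'EOF'
Goedel - Declarative language.
ftp://ftp.cs.bris.ac.uk/goedel

Example Lang - Hypothetical entry, for illustration only.
EOF

# Default FS: $1 is the first word of each paragraph.
awk 'BEGIN { RS = "" } { print $1 }' words.txt

# FS set to " - ": $1 is everything before the first " - ".
awk -F ' - ' 'BEGIN { RS = "" } { print $1 }' words.txt
```

The first command should print Goedel and Example, while the second should print Goedel and Example Lang. For the single-word name in the question, both behave the same.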

score 0 · Accepted Answer

This one-liner does what you want:

awk -F ' - ' 'NF>1{print $1;exit}'
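A short sketch of why this works on the text from the question, assuming it has been saved as goedel.txt (a made-up file name). Line 4 of the sample also contains " - ", which is why the original command printed a second, unwanted line:

```shell
# The sample text from the question, saved under a hypothetical name.
cat > goedel.txt <<'EOF'
Goedel - Declarative language for AI, based on many-sorted logic.  Strongly
typed, polymorphic, declarative, with a module system.  Supports bignums
and sets.  "The Goedel Programming Language", P. M. Hill et al, MIT Press
1994, ISBN 0-262-08229-2.  Goedel 1.4 - partial implementation in SICStus
Prolog 2.1.
ftp://ftp.cs.bris.ac.uk/goedel
info: goedel@compsci.bristol.ac.uk
EOF

# NF > 1 is true only on lines that contain the " - " separator;
# exit stops after the first such line, so line 4 is never reached.
awk -F ' - ' 'NF > 1 { print $1; exit }' goedel.txt
```

Because of the exit, this prints only the first entry of the file. For input with several entries, one of the record-separator approaches is likely a better fit.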

Answered 2013-08-12T08:42:57.900
