
I would like to know how to filter the following lines in AWK:

DSL - 

  1. Digital Simulation Language.  Extensions to FORTRAN to simulate analog
computer functions.  "DSL/90 - A Digital Simulation Program for Continuous
System Modelling", Proc SJCC 28, AFIPS (Spring 1966).  Version: DSL/90 for
the IBM 7090.  Sammet 1969, p.632.

FLIP - 

  1. Early assembly language on G-15.  Listed in CACM 2(5):16 (May 1959).

  2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.

  3. Formal LIst Processor.  Early language for pattern-matching on LISP
structures.  Similar to CONVERT.  "FLIP, A Format List Processor", W.
Teitelman, Memo MAC-M-263, MIT 1966.

so that I get something like this:

DSL

FLIP

I am using the following statements in AWK:

BEGIN { RS = "\n\n\n" ;  FS = " - " } 

{ print $1 }

But all I get is this:

DSL

Thanks in advance!


5 Answers


Assuming the format is constant (no spaces in the first entry):

awk '$2 == "-" { print $1 }' file

Edit: but if you have entries like this:

Objective C -
...

you need something like:

awk '$NF == "-" { $NF = ""; print }' file
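
One detail of the $NF approach: assigning to $NF makes awk rebuild the record, so the printed line keeps a trailing space where the removed field used to be. If that matters, a variant can strip the separator with sub() instead. This is only a sketch; headings is a hypothetical helper name, and it additionally assumes that heading lines start in column 1:

```shell
# headings is a hypothetical helper name; it reads the named files, or stdin if none are given.
headings() {
  awk '
    # Match lines that start in column 1 and end in a blank followed by "-".
    /^[^ \t].*[ \t]-[ \t]*$/ {
      sub(/[ \t]+-[ \t]*$/, "")   # remove the separator and any trailing blanks
      print
    }
  ' "$@"
}
```

For the entry above, headings file prints Objective C with no trailing blank.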

awk is very good at parsing flat files with a predictable format.

answered 2013-08-12T09:18:23.083

It looks like you are looking for lines that contain exactly two words, where the second word is -. If so, you can write:

awk 'NF == 2 && $2 == "-" { print $1 }'

You can qualify it further to insist that $1 starts at the beginning of the line (no leading blanks):

awk '$0 !~ /^ / && NF == 2 && $2 == "-" { print $1 }'

Given the data shown, both of these print the lines DSL and FLIP.
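
As for why the program in the question prints only DSL: the sample never has more than two consecutive newlines (one blank line), so RS = "\n\n\n" never matches. In awks that treat a multi-character RS as a regular expression (gawk and mawk do; POSIX leaves the behaviour unspecified), the whole file then becomes a single record, and $1 of that record is DSL. A small sketch to check this, assuming one of those awks; count_first is a hypothetical helper name and the input is a cut-down copy of the data:

```shell
# count_first is a hypothetical helper: the BEGIN block is the one from the
# question, but each record is printed with its record number in front.
count_first() {
  awk 'BEGIN { RS = "\n\n\n"; FS = " - " } { print NR ": " $1 }' "$@"
}

# Cut-down sample: headings and items separated by single blank lines.
printf 'DSL - \n\n  1. One.\n\nFLIP - \n\n  1. Two.\n' | count_first
```

With gawk or mawk this should report one record only, which is why a line-oriented test such as the ones above is the simpler route.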

answered 2013-08-12T10:08:04.573

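@JonathanLeffler gave a good awk answer to your specific question, but if you are going to process files in that format often, you may want to consider reformatting them so that records are separated by blank lines and each list item is on a single line, e.g.: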

$ cat file
DSL -

  1. Digital Simulation Language.  Extensions to FORTRAN to simulate analog
computer functions.  "DSL/90 - A Digital Simulation Program for Continuous
System Modelling", Proc SJCC 28, AFIPS (Spring 1966).  Version: DSL/90 for
the IBM 7090.  Sammet 1969, p.632.

FLIP -

  1. Early assembly language on G-15.  Listed in CACM 2(5):16 (May 1959).

  2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.

  3. Formal LIst Processor.  Early language for pattern-matching on LISP
structures.  Similar to CONVERT.  "FLIP, A Format List Processor", W.
Teitelman, Memo MAC-M-263, MIT 1966.

$ awk '!/^[[:space:]]*$/{printf "%s%s", (NF==2 && /-[[:space:]]*$/ ? rs rs : (/^ +[[:digit:]]+\./ ? rs : "")), $0; rs="\n"} END{print ""}' file
DSL -
  1. Digital Simulation Language.  Extensions to FORTRAN to simulate analogcomputer functions.  "DSL/90 - A Digital Simulation Program for ContinuousSystem Modelling", Proc SJCC 28, AFIPS (Spring 1966).  Version: DSL/90 forthe IBM 7090.  Sammet 1969, p.632.

FLIP -
  1. Early assembly language on G-15.  Listed in CACM 2(5):16 (May 1959).
  2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.
  3. Formal LIst Processor.  Early language for pattern-matching on LISPstructures.  Similar to CONVERT.  "FLIP, A Format List Processor", W.Teitelman, Memo MAC-M-263, MIT 1966.

With the data in that form, you can easily process the output to print or do whatever else you want, e.g.:

1) Print each header line plus its first bullet item:

$ awk '...' file | awk 'BEGIN{RS=""; ORS="\n\n"; FS=OFS="\n"} {print $1,$2}'
DSL -
  1. Digital Simulation Language.  Extensions to FORTRAN to simulate analogcomputer functions.  "DSL/90 - A Digital Simulation Program for ContinuousSystem Modelling", Proc SJCC 28, AFIPS (Spring 1966).  Version: DSL/90 forthe IBM 7090.  Sammet 1969, p.632.

FLIP -
  1. Early assembly language on G-15.  Listed in CACM 2(5):16 (May 1959).

2) Print the header line plus the second bullet item of the "FLIP" record:

$ awk '...' file | awk 'BEGIN{RS=""; ORS="\n\n"; FS=OFS="\n"} /^FLIP -/{print $1,$3}'
FLIP -
  2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.

3) Print the header line plus the number of bullet items under that header:

$ awk '...' file | awk 'BEGIN{RS=""; FS=OFS="\n"} {print $1 NF-1}'
DSL - 1
FLIP - 3

and so on.
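
One detail in the reformatting command above: wrapped lines are glued together with no separator, which is why the output reads "analogcomputer". Below is a sketch of a variant that joins wrapped lines with a single space and otherwise follows the same logic. reflow is a hypothetical helper name, and [ \t] and [0-9] stand in for the POSIX character classes, on the assumption that the data is plain ASCII:

```shell
# reflow is a hypothetical helper name; it reads the named files, or stdin if none are given.
reflow() {
  awk '
    /^[ \t]*$/ { next }               # skip blank lines
    {
      if (NF == 2 && /-[ \t]*$/) {
        sep = rs rs                   # heading: blank line in front of it
      } else if (/^ +[0-9]+\./) {
        sep = rs                      # numbered item: starts a new line
      } else {
        sep = (rs == "" ? "" : " ")   # wrapped text: join with one space
      }
      printf "%s%s", sep, $0
      rs = "\n"                       # empty only before the first printed line
    }
    END { print "" }                  # terminate the last line
  ' "$@"
}
```

It can replace the first stage of the pipelines above, e.g. reflow file | awk 'BEGIN{RS=""; ORS="\n\n"; FS=OFS="\n"} {print $1,$2}'.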

answered 2013-08-12T11:47:09.660

If all the lines you want to skip begin with a space, this will work:

awk -F"-" '{if (substr($1,1,1)!=" ")print $1}'
answered 2013-08-11T22:12:35.820

A short grep one-liner can do it for you:

grep -Po '.*(?= -\s*$)' file
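
-P (Perl-compatible regular expressions) is a GNU grep option, so the command above may not work with other grep implementations. A rough sed equivalent that uses only POSIX basic regular expressions, offered as a sketch; heading_names is a hypothetical helper name:

```shell
# heading_names is a hypothetical helper name; it reads the named files, or stdin if none are given.
heading_names() {
  # -n suppresses the default output; the p flag prints only the lines
  # on which the trailing " -" (plus any blanks after it) was removed.
  sed -n 's/[[:space:]]\{1,\}-[[:space:]]*$//p' "$@"
}
```

For the data in the question, heading_names file should print DSL and FLIP.
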
answered 2013-08-12T08:48:25.287