1

File1:

hello
- dictionary definitions:
hi
hello
hallo
greetings
salutations
no more hello for you
-
world
- dictionary definitions:
universe
everything
the globe
the biggest tree
planet
cess pool of organic life
-

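I need to reformat this (for a large list of words) into a term-to-definition format, with one line per term. How can I do that? No two words are the same; only the structure shown above repeats. The resulting file should look like this: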

hello    - dictionary definitions:    hi    hello    hallo    greetings    salutations    no more hello for you    -
world    - dictionary definitions:    universe    everything    the globe    the biggest tree    planet    cess pool of organic life    -

Awk/Sed/Grep/Cat are the usual contenders.

4

6 Answers

3

Who says only Perl can do this elegantly? :)

$ gawk -vRS="-\n" '{gsub(/\n/," ")}1' file
hello - dictionary definitions: hi hello hallo greetings salutations no more hello for you
world - dictionary definitions: universe everything the globe the biggest tree planet cess pool of organic life

Or

$ gawk 'BEGIN{RS="-\n";FS="\n";OFS=" "}{$1=$1}1'  file
hello - dictionary definitions: hi hello hallo greetings salutations no more hello for you
world - dictionary definitions: universe everything the globe the biggest tree planet cess pool of organic life
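
For readers who don't have gawk at hand, the same record-separator idea can be sketched in Python. This is only an illustration under assumptions: the function name join_records and its parameters are hypothetical, and it treats every "-" followed by a newline as the end of a record, just as RS="-\n" does.

```python
def join_records(text, record_sep="-\n", field_sep=" "):
    """Split text into records on record_sep, then join each record's lines.

    Hypothetical helper mirroring RS="-\n" with FS="\n" and OFS=" ".
    """
    rows = []
    for record in text.split(record_sep):
        fields = record.split("\n")
        # Drop the empty field left by the newline just before the separator.
        if fields and fields[-1] == "":
            fields.pop()
        if fields:
            rows.append(field_sep.join(fields))
    return rows


if __name__ == "__main__":
    sample = (
        "hello\n- dictionary definitions:\nhi\nhallo\n-\n"
        "world\n- dictionary definitions:\nuniverse\nplanet\n-\n"
    )
    for row in join_records(sample):
        print(row)
```

Like the gawk versions, this would also split on any line that merely ends in "-"; using "\n-\n" as the separator is stricter, because it only matches a "-" that sits alone on its line.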
answered 2009-10-24T11:56:01.690
2
awk 'BEGIN {FS="\n"; RS="-\n"}{for(i=1;i<=NF;i++) printf("%s   ",$i); if($1)print"-";}' dict.txt

Output:

hello   - dictionary definitions:   hi   hello   hallo   greetings   salutations   no more hello for you   -
world   - dictionary definitions:   universe   everything   the globe   the biggest tree   planet   cess pool of organic life   -
answered 2009-10-24T11:05:40.323
2

Perl one-liner:

perl -pe 'chomp;s/^-$/\n/;print " "' File1

 hello - dictionary definitions: hi hello hallo greetings salutations no more hello for you
 world - dictionary definitions: universe everything the globe the biggest tree planet cess pool of organic life 

This is "close to" your desired output: every line starts with a space, and the closing "-" is dropped.

answered 2009-10-24T11:18:20.067
1

Not sure which scripting language you will be using, so here is some pseudocode:

for each line
 if line is "-"
  create new line
 else
  append separator to previous line
  append line to previous line
 end if
end for loop
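
The pseudocode above could be turned into Python roughly as follows. This is a sketch, not the answerer's own code: the name join_entries and the four-space separator are assumptions, and unlike the pseudocode it keeps the closing "-" on each row so the result matches the layout asked for in the question.

```python
def join_entries(lines, separator="    "):
    """Collect lines into one row per entry; a lone "-" closes the entry.

    Hypothetical helper; `lines` is any iterable of strings, e.g. an open file.
    """
    rows = []
    current = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line and not current:
            continue  # skip blank lines between entries
        current.append(line)
        if line == "-":
            # End of entry: emit the accumulated row and start a new one.
            rows.append(separator.join(current))
            current = []
    if current:
        # Input ended without a closing "-"; keep what was collected.
        rows.append(separator.join(current))
    return rows


if __name__ == "__main__":
    import sys

    # Usage (assumed): python join_entries.py < File1
    for row in join_entries(sys.stdin):
        print(row)
```

Reading line by line keeps memory use flat for a large word list, since only the current entry is held at any time besides the finished rows.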
answered 2009-10-24T10:48:03.170
1

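Try this, provided every word always has exactly 6 definition lines (9 lines per entry, which is why there are eight N commands):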

sed 'N;N;N;N;N;N;N;N;s/\n/ /g' test_3
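
The fixed-length assumption can also be written out in Python, which makes the entry size an explicit parameter instead of a count of N commands. A minimal sketch, assuming every entry spans the same number of lines; join_fixed and lines_per_entry are hypothetical names.

```python
def join_fixed(lines, lines_per_entry=9, separator=" "):
    """Join every `lines_per_entry` consecutive lines into one row.

    Hypothetical helper; only valid when all entries have the same length.
    """
    stripped = [line.rstrip("\n") for line in lines]
    return [
        separator.join(stripped[i:i + lines_per_entry])
        for i in range(0, len(stripped), lines_per_entry)
    ]


if __name__ == "__main__":
    import sys

    # Usage (assumed): python join_fixed.py < test_3
    for row in join_fixed(sys.stdin):
        print(row)
```

If a single entry has more or fewer definitions, every row after it shifts, so this approach only fits data known to be uniform.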
answered 2009-10-24T11:49:05.323
1
sed -ne'1{x;d};/^-$/{g;s/\n/ /g;p;n;x;d};H'
awk -v'RS=\n-\n' '{gsub(/\n/," ")}1'
answered 2009-10-24T18:04:41.707