awk - Bash: how to split a file on blank lines using awk

Question

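I have a text file (A.in) that I want to split into multiple files. A split should happen each time a blank line is found. The file names should be sequential (A1.in, A2.in, ...).

I found this answer, which suggests using awk, but I can't get it to produce the naming convention I want: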

awk -v RS="" '{print $0 > $1".txt"}' file

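I also found other answers that suggest the command csplit -l, but I can't get it to match blank lines. I tried the pattern '' but I'm not very familiar with regular expressions, and I get the following: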

bash-3.2$ csplit A.in ""
csplit: : unrecognised pattern

Input file (A.in):

4 
RURDDD

6
RRULDD
KKKKKK

26
RRRULU

Expected output:

A1.in

4 
RURDDD

A2.in

6
RRULDD
KKKKKK

A3.in

26
RRRULU

score 4 · Accepted Answer

Another fix using awk:

$ awk -v RS="" '{
    split(FILENAME,a,".")  # separate name and extension
    f=a[1] NR "." a[2]     # form the filename, use NR as number
    print > f              # output to file
    close(f)               # close it, so MANY records do not exhaust file descriptors
}' A.in
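
To check this against the sample from the question, a minimal sketch follows. It assumes a scratch directory called split_demo (a hypothetical name, not from the original answer) and recreates the input with printf:

```shell
# Recreate the sample input in a scratch directory (split_demo is a hypothetical name).
mkdir -p split_demo
printf '4 \nRURDDD\n\n6\nRRULDD\nKKKKKK\n\n26\nRRRULU\n' > split_demo/A.in

# Run from inside the directory so that FILENAME is just "A.in".
(
  cd split_demo &&
  awk -v RS="" '{
      split(FILENAME, a, ".")   # a[1] = "A", a[2] = "in"
      f = a[1] NR "." a[2]      # A1.in, A2.in, ...
      print > f                 # write the whole blank-line-delimited record
      close(f)                  # release the file descriptor
  }' A.in
)

ls split_demo
```

Note that splitting FILENAME on "." assumes the name contains exactly one dot. A path such as ./A.in would split into three pieces and produce an unusable output name, so run the command from the directory that holds the file.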

score 2 · Accepted Answer

In any normal case, the following script should work:

awk 'BEGIN{RS=""}{ print > ("A" NR ".in") }' file

If this fails, the most likely cause is CRLF line endings: a line that contains only a carriage return is not empty, so awk's paragraph mode (RS="") does not treat it as a separator.
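
One way to handle the CRLF case is to strip the carriage returns before awk sees the data. This is a minimal sketch, assuming the file really has Windows line endings; the directory crlf_demo and the shortened sample are hypothetical and only there to make the example self-contained:

```shell
mkdir -p crlf_demo
# Sample with CRLF line endings: the "blank" line contains a lone \r.
printf '4\r\nRURDDD\r\n\r\n6\r\nRRULDD\r\n' > crlf_demo/A.in

(
  cd crlf_demo &&
  # tr deletes every carriage return, so blank lines are truly empty
  # and paragraph mode (RS="") can split on them.
  tr -d '\r' < A.in |
    awk 'BEGIN{RS=""}{ f = "A" NR ".in"; print > f; close(f) }'
)
```

The resulting files use plain LF line endings, since the carriage returns are removed from the whole stream.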

As James mentioned, closing each file after writing makes this a bit more robust:

awk 'BEGIN{RS=""}{ f = "A" NR ".in"; print > f; close(f) }' file

If you want to use csplit, the following will do the trick:

csplit --suppress-matched  -f "A" -b "%0.2d.in" A.in '/^$/' '{*}'

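See man csplit for details on these options. Note that --suppress-matched, -b and '{*}' are GNU extensions, and that the pieces are numbered from zero (A00.in, A01.in, ...).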
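
Because csplit numbers its output from 00, getting the A1.in, A2.in, ... names from the question takes a rename step. The following is a sketch only, assuming GNU csplit is installed; the directory csplit_demo, the added -s flag, and the rename loop are illustrative additions and not part of the original answer:

```shell
mkdir -p csplit_demo
printf '4 \nRURDDD\n\n6\nRRULDD\nKKKKKK\n\n26\nRRRULU\n' > csplit_demo/A.in

(
  cd csplit_demo &&
  # -s only silences the byte counts csplit normally prints (GNU csplit assumed).
  csplit -s --suppress-matched -f "A" -b "%0.2d.in" A.in '/^$/' '{*}' &&
  # Rename A00.in, A01.in, ... to A1.in, A2.in, ...
  i=1 &&
  for f in A[0-9][0-9].in; do
      mv "$f" "A$i.in"
      i=$((i + 1))
  done
)
```

The glob A[0-9][0-9].in assumes fewer than 100 pieces; a larger input would need a wider pattern.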

score 0 · Accepted Answer

Input file contents:

$ cat A.in 
4 
RURDDD

6
RRULDD
KKKKKK

26
RRRULU

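Contents of the awk script (ctrl.awk):

BEGIN{
    n=1                              # index of the current output file
}
{
    if(NF!=0){
        print $0 >> ("A" n ".in")    # non-blank line: append to the current file
    }else{
        n++                          # blank line: move on to the next file
    }
}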

Run:

awk -f ctrl.awk A.in

Output:

$ cat A1.in 
4 
RURDDD

$ cat A2.in 
6
RRULDD
KKKKKK

$ cat A3.in 
26
RRRULU

PS: the same thing as a one-liner, without a separate awk file:

awk 'BEGIN{n=1}{if(NF!=0){print $0 >> ("A" n ".in")}else{n++}}' A.in
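
Two details of this line-by-line approach are worth knowing: >> appends, so running the command twice doubles the contents of every output file, and several consecutive blank lines increment n more than once, which leaves gaps in the numbering. The variant below is a minimal sketch that avoids both, assuming neither behaviour is wanted; the directory lines_demo and the variable f are illustrative names:

```shell
mkdir -p lines_demo
# Same sample as in the question, but with TWO blank lines after the first block.
printf '4 \nRURDDD\n\n\n6\nRRULDD\nKKKKKK\n\n26\nRRRULU\n' > lines_demo/A.in

(
  cd lines_demo &&
  awk '
    NF == 0 {                 # blank or whitespace-only line ends the current chunk
        if (f != "") {
            close(f)
            f = ""
        }
        next
    }
    f == "" {                 # first non-blank line of a new chunk
        f = "A" (++n) ".in"   # number only chunks that actually have content
    }
    { print > f }             # ">" truncates on first open, then keeps writing until close
  ' A.in
)
```

Since each output file is opened once with > and closed at the end of its chunk, a second run overwrites the files instead of appending to them.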
