我有一个大文件(2GB),看起来像这样:
>10GS_A
YTVVYFPVRGRCAALRMLLADQGQSWKEEVVTVETWQEGSLKASCLYGQLPKFQDGD
LTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYTNYEAGKD
DYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCLDAFP
LLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ
>11BA_A
KESAAAKFERQHMDSGNSPSSSSNYCNLMMCCRKMTQGKCKPVNTFVHESLADVKAV
CSQKKVTCKNGQTNCYQSKSTMRITDCRETGSSKYPNCAYKTTQVEKHIIVACGGKP
SVPVHFDASV
>11BG_A
KESAAKFERQHMDSGNSPSSSSNYCNLMMCCRKMTQGKCKPVNTFVHESLADVKAVCSQKKVT
CKNGQTNCYQSKSTMRITDCRETGSSKYPNCAYKTTQVEKHIIVACGGKPSVPVHFDASV
>121P_A
MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAGQEEYSAMRD
QYMRTGEGFLCVFAINNTKSFEDIHQYREQIKRVKDSDDVPMVLVGNKCDLAARTVESRQAQDLARSYG
IPYIETSAKTRQGVEDAFYTLVREIRQH
我想将此文件拆分为基于分隔符“>”的较小文件,在这种情况下,生成了 4 个包含以下文本并以下列方式命名的文件:
10gs_A.txt
11ba_A.txt
11bg_A.txt
121p_A.txt
并且它们包含以下内容:10gs_A.txt
>10GS_A
YTVVYFPVRGRCAALRMLLADQGQSWKEEVVTVETWQEGSLKASCLYGQLPKFQDGD
LTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYTNYEAGKD
DYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCLDAFP
LLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ
11ba_A.txt
>11BA_A
KESAAAKFERQHMDSGNSPSSSSNYCNLMMCCRKMTQGKCKPVNTFVHESLADVKAV
CSQKKVTCKNGQTNCYQSKSTMRITDCRETGSSKYPNCAYKTTQVEKHIIVACGGKP
SVPVHFDASV
... 等等。我知道在 linux 中使用 split 命令分隔较大的文本文件,但是它将创建的文件命名为 temp00、temp01、temp03。有没有办法拆分这个较大的文件并根据我的需要命名文件?实现此目的的拆分功能是什么?