我有一个文件,其行是这样的:
EF457507|S000834932 Root;Bacteria;"Acidobacteria";Acidobacteria_Gp4;Gp4
EF457374|S000834799 Root;Bacteria;"Acidobacteria";Acidobacteria_Gp14;Gp14
AJ133184|S000323093 Root;Bacteria;Cyanobacteria/Chloroplast;Cyanobacteria;Family I;GpI
DQ490004|S000686022 Root;Bacteria;"Armatimonadetes";Armatimonadetes_gp7
AF268998|S000340459 Root;Bacteria;TM7;TM7_genera_incertae_sedis
我想打印第一个标签和最后一个分号之间的任何东西,就像那样
EF457507|S000834932 Gp4
EF457374|S000834799 Gp14
AJ133184|S000323093 GpI
DQ490004|S000686022 Armatimonadetes_gp7
AF268998|S000340459 TM7_genera_incertae_sedis
我尝试使用正则表达式,但它不起作用,有没有办法使用 Linux、awk 或 Perl 来做到这一点?