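I have a data wrangling problem that I can't figure out how to solve. I have a data frame in which the values of one column are all shifted up, and that column isn't completely filled. I need to shift those values down and fill X rows with each one, where X depends on how many rows of data follow it in the other columns.
Edit: I've changed the way I display the data. I previously pasted a markdown table, which was misleading. Sorry about that. The data I'm working with looks like this: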
code IdGene Type COGgene PosLeft postRight Strand Function
1
1075082 CDS ROG0189 93 710 + NA
8
1075089 CDS COG0226 5632 6741 + [P] ABC-type phosphate transport system, periplasmic component
1075103 CDS NA 6796 7869 + NA
9
1075105 CDS NA 8075 8923 + NA
1075096 CDS ROG0189 8983 10149 + NA
1071820 CDS NA 10181 10723 + NA
10
1071880 CDS COG0642 10893 13316 + [T] Signal transduction histidine kinase
1072052 CDS COG2204 13288 14586 + [T] Response regulator containing CheY-like receiver, AAA-type
12
1075092 CDS NA 15525 16472 + NA
13
1075087 CDS NA 16655 17371 + NA
1074837 CDS NA 17383 17703 + NA
1071956 CDS NA 17710 18168 + NA
14
1071684 CDS NA 18251 18919 - NA
15
1075519 CDS ROG5478 19044 19334 + NA
27
1075067 CDS ROG8331 35989 36417 + NA
1075056 CDS COG2244 36478 38019 + [R] Membrane protein involved in the export
1075546 CDS COG1035 38016 39218 + [C] Coenzyme F420-reducing hydrogenase, beta subunit
1074004 CDS ROG1263 39215 40375 + NA
1075083 CDS COG1701 40406 40582 + [S] Uncharacterized protein conserved in archaea
1075068 CDS COG0463 40593 41537 + [M] Glycosyltransferases involved in cell wall biogenesis
1075064 CDS ROG2632 41534 42700 + NA
1075066 CDS COG0463 42724 43656 + [M] Glycosyltransferases involved in cell wall biogenesis
1075069 CDS COG1215 43671 44066 + [M] Glycosyltransferases, probably involved in cell wall
I need to turn it into this:
code IdGene Type COGgene PosLeft postRight Strand Function
1 1075082 CDS ROG0189 93 710 + NA
8 1075089 CDS COG0226 5632 6741 + [P] ABC-type phosphate transport system, periplasmic component
8 1075103 CDS NA 6796 7869 + NA
9 1075105 CDS NA 8075 8923 + NA
9 1075096 CDS ROG0189 8983 10149 + NA
9 1071820 CDS NA 10181 10723 + NA
10 1071880 CDS COG0642 10893 13316 + [T] Signal transduction histidine kinase
10 1072052 CDS COG2204 13288 14586 + [T] Response regulator containing CheY-like receiver, AAA-type
12 1075092 CDS NA 15525 16472 + NA
13 1075087 CDS NA 16655 17371 + NA
13 1074837 CDS NA 17383 17703 + NA
13 1071956 CDS NA 17710 18168 + NA
14 1071684 CDS NA 18251 18919 - NA
15 1075519 CDS ROG5478 19044 19334 + NA
27 1075067 CDS ROG8331 35989 36417 + NA
27 1075056 CDS COG2244 36478 38019 + [R] Membrane protein involved in the export
27 1075546 CDS COG1035 38016 39218 + [C] Coenzyme F420-reducing hydrogenase, beta subunit
27 1074004 CDS ROG1263 39215 40375 + NA
27 1075083 CDS COG1701 40406 40582 + [S] Uncharacterized protein conserved in archaea
27 1075068 CDS COG0463 40593 41537 + [M] Glycosyltransferases involved in cell wall biogenesis
27 1075064 CDS ROG2632 41534 42700 + NA
27 1075066 CDS COG0463 42724 43656 + [M] Glycosyltransferases involved in cell wall biogenesis
27 1075069 CDS COG1215 43671 44066 + [M] Glycosyltransferases, probably involved in cell wall
Any ideas on how to solve this would be great. Ideally in R, but awk or something else is fine too.
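To make the intended transformation concrete, here is a minimal sketch in Python, since tools other than R are acceptable. It assumes the data is available as plain-text lines, that the first non-empty line is the header, and that any line consisting only of digits is a `code` value applying to every record line after it until the next such line. The function name `fill_down_codes` and the `"NA"` placeholder for records that appear before any code are hypothetical choices for illustration, not part of the original data.

```python
def fill_down_codes(lines):
    """Prefix every record line with the most recent standalone code line.

    Assumptions: the first non-empty line is the header, and a line made up
    only of digits is a `code` value rather than a record.
    """
    out = []
    header_seen = False
    current = "NA"  # hypothetical placeholder for records seen before any code
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if not header_seen:
            out.append(line)  # keep the header unchanged
            header_seen = True
        elif line.isdigit():
            current = line  # remember the code; emit nothing for this line
        else:
            out.append(f"{current} {line}")  # attach the code to the record
    return out


sample = """code IdGene Type COGgene PosLeft postRight Strand Function
1
1075082 CDS ROG0189 93 710 + NA
8
1075089 CDS COG0226 5632 6741 + [P] ABC-type phosphate transport system, periplasmic component
1075103 CDS NA 6796 7869 + NA
"""

result = fill_down_codes(sample.splitlines())
print("\n".join(result))
```

The same idea (remember the last code seen, then prepend it to each following record) should carry over to awk or to a fill-down step in R, provided the code lines can be told apart from the record lines as assumed here.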