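I am trying to iterate column by column over a data frame created with pandas in Python. Printing a whole column is easy, but I can't figure out how to convert that column into a list or a string so that I can actually work with the data it contains (in this case, concatenate the values and write them to a FASTA file). My code is below. Any suggestions would be much appreciated.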
import sys
import string
import shlex
import numpy as np
import pandas as pd

SNP_df = pd.read_csv('SNPs.txt', sep='\t', index_col=None, header=None, nrows=101)
output = open('100 SNPs.fa', 'a')
i = 1
for i in SNP_df[i]:
    data = SNP_df[i]
    data = shlex.shlex(data, posix=True)
    data.whitespace += "\n"
    data.whitespace_split = True
    data = list(data)
    for j in data:
        if j == 0:
            output.write(("\n>%s\n") % (str(data(j))))
        else:
            output.write(data(j))
Here are the first few lines of my data file:
POSITION REF AR_DM1005 AR_DM1015 AR_DM1050 AR_DM1056 AR_DM1088 AR_KB635 AR_KB652 AR_KB754 AR_KB819 AR_KB820 AR_KB827 AR_KB945 AR_MSH126 AR_MSH51 PP_BdA1134-13 PP_BdA1137-10 PP_DM1038 PP_DM1049 PP_DM1054 PP_DM1065 PP_DM1081 PP_DM1084 PP_JR83 ST_JR138 ST_JR158 ST_JR209 ST_JR72 ST_JR84 ST_JR91 ST_MSH177 ST_MSH217 CH_JR198 CH_JR20 CH_JR272 CH_JR356 CH_JR377 CH_KB888 CH_MSH202 TL_MA1959 TL_MSH130 TL_SCI12-2 TL_SPE123_2-3 TL_SPE123_5-1 TL_SPE123_6-3 TL_SPE123_7-1 TL_SPE123_8-1 CUSP_SPE132_34_1-2
55 CTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTC
380 GGAAGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGAG GGGGGGGG
391 AAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
422
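For what it's worth, here is a minimal sketch of the column-to-string step described above. It rests on a few assumptions: the file is tab-separated, the first row holds the sample names (so it is read with `header=0` rather than `header=None`), and each cell holds a single base. The in-memory `sample` text and the helper name `columns_to_fasta` are made up for illustration. The idea is that `DataFrame.items()` yields each column as a `(label, Series)` pair, `Series.tolist()` gives a plain list, and `"".join(...)` turns that list into one string.

```python
import io

import pandas as pd

# Hypothetical stand-in for 'SNPs.txt': tab-separated, first row holds the
# column names, one SNP position per subsequent row.
sample = io.StringIO(
    "POSITION\tREF\tAR_DM1005\tAR_DM1015\n"
    "55\tC\tT\tT\n"
    "380\tG\tG\tA\n"
    "391\tA\tA\tG\n"
)

# header=0 keeps the sample names as column labels. dtype=str together with
# keep_default_na=False keeps every cell a plain string, so the cells can be
# joined without type errors.
snp_df = pd.read_csv(sample, sep="\t", header=0, dtype=str, keep_default_na=False)


def columns_to_fasta(df, skip=("POSITION",)):
    """Return one FASTA record per column, with the column's rows concatenated."""
    records = []
    for name, column in df.items():   # iterates over (label, Series) pairs
        if name in skip:
            continue
        bases = column.tolist()        # Series -> plain Python list
        sequence = "".join(bases)      # list of strings -> one string
        records.append(f">{name}\n{sequence}\n")
    return "".join(records)


fasta_text = columns_to_fasta(snp_df)
print(fasta_text)
```

To use this on the real file, something like `pd.read_csv('SNPs.txt', sep='\t', header=0, dtype=str, keep_default_na=False, nrows=101)` in place of the `sample` object should work under the same assumptions, and the result can be written out with `with open('100 SNPs.fa', 'w') as output: output.write(fasta_text)`.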