I have to process .txt files that are present in subfolders inside a folder, like this:
New Folder > Folder 1 to 6 > xx.txt & yy.txt (files present in each folder)
Each file contains two columns:
arg his
asp gln
glu his
and
arg his
glu arg
arg his
glu asp
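What I want to do is:
1) Count the occurrences of each word in each file, and average them by dividing by the total no. of lines in that file.
2) Then take the values obtained from step 1 and divide them by the total no. of files present in the folder to get an average (i.e. 2 in this case).
I have tried with my code as follows, but I only succeeded in the first case and could not get the second: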
import glob
import os

for root, dirs, files in os.walk(path):
    # Folder-level values, reset for every folder visited
    asp_count = 0
    glu_count = 0
    lys_count = 0
    arg_count = 0
    his_count = 0
    acid_count = 0
    base_count = 0
    count = 0  # number of .txt files seen in this folder
    listOfFile = glob.iglob(os.path.join(root, '*.txt'))
    for filename in listOfFile:
        # Per-file counters, one per residue and column
        lineCount = 0
        asp_count_col1 = 0
        asp_count_col2 = 0
        glu_count_col1 = 0
        glu_count_col2 = 0
        lys_count_col1 = 0
        lys_count_col2 = 0
        arg_count_col1 = 0
        arg_count_col2 = 0
        his_count_col1 = 0
        his_count_col2 = 0
        count += 1
        with open(filename) as inp:
            for line in map(str.split, inp):
                lineCount += 1
                k = line[4]
                m = line[6]
                if k == 'ASP':
                    asp_count_col1 += 1
                elif m == 'ASP':
                    asp_count_col2 += 1
                if k == 'GLU':
                    glu_count_col1 += 1
                elif m == 'GLU':
                    glu_count_col2 += 1
                if k == 'LYS':
                    lys_count_col1 += 1
                elif m == 'LYS':
                    lys_count_col2 += 1
                if k == 'ARG':
                    arg_count_col1 += 1
                elif m == 'ARG':
                    arg_count_col2 += 1
                if k == 'HIS':
                    his_count_col1 += 1
                elif m == 'HIS':
                    his_count_col2 += 1
        # Per-file average: occurrences divided by the number of lines.
        # These values are overwritten by the next file in the folder.
        asp_count = (float(asp_count_col1 + asp_count_col2)) / lineCount
        glu_count = (float(glu_count_col1 + glu_count_col2)) / lineCount
        lys_count = (float(lys_count_col1 + lys_count_col2)) / lineCount
        arg_count = (float(arg_count_col1 + arg_count_col2)) / lineCount
        his_count = (float(his_count_col1 + his_count_col2)) / lineCount
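Up to this point I can get the average for each file. But how can I get the average for each subfolder (i.e. by dividing by count, the total no. of files)? The problem is the second part; the first part is done. The code provided averages the values for each file, but I want to add these averages up and derive a new average by dividing by the total no. of files present in the subfolder.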
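
To make the second step concrete, here is a minimal sketch of what I am after, not a tested solution. The helper names `file_frequencies` and `folder_averages` and the `col1`/`col2` parameters are made up for illustration. It assumes whitespace-separated columns, defaults to columns 0 and 1 as in the sample above (my real code reads `line[4]` and `line[6]`), and upper-cases the words so that `arg` and `ARG` are treated alike. Unlike my if/elif chain, it counts a word in both columns even when the same word appears twice on one line.

```python
import glob
import os
from collections import Counter, defaultdict


def file_frequencies(filename, col1=0, col2=1):
    """Step 1: occurrences of each word in one file, divided by its line count."""
    counts = Counter()
    line_count = 0
    with open(filename) as inp:
        for fields in map(str.split, inp):
            if len(fields) <= max(col1, col2):
                continue  # skip blank or too-short lines
            line_count += 1
            counts[fields[col1].upper()] += 1
            counts[fields[col2].upper()] += 1
    if line_count == 0:
        return {}
    return {word: float(n) / line_count for word, n in counts.items()}


def folder_averages(path):
    """Step 2: for each folder, sum the per-file values and divide by the file count."""
    result = {}
    for root, dirs, files in os.walk(path):
        totals = defaultdict(float)  # running sum of per-file averages
        file_count = 0
        for filename in glob.iglob(os.path.join(root, '*.txt')):
            file_count += 1
            for word, freq in file_frequencies(filename).items():
                totals[word] += freq
        if file_count:  # folders without .txt files are left out
            result[root] = {w: t / file_count for w, t in totals.items()}
    return result
```

With the two sample files above in one subfolder, `arg` occurs 1 time in 3 lines in the first file and 3 times in 4 lines in the second, so the folder value would be (1/3 + 3/4) / 2 = 13/24, roughly 0.542. The key point is that the running sums live at the folder level and are only divided by the file count after the inner loop over files has finished.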