I have a file energy.txt:

path   energy      counter
AXX    100.00          1
AXX     99.99          2
AXX     99.98          1
AXX     99.50          1
AXX     99.00          7

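I want to compare consecutive values in the second column. If the difference between two of them is less than 0.02, keep the second value and add their counters together.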

For example, the first step is 100.00 - 99.99 = 0.01 (less than 0.02), so

path   energy      counter
AXX     99.99          3   
AXX     99.98          1
AXX     99.50          1
AXX     99.00          7 

Second: 99.99 - 99.98 = 0.01, so

path   energy      counter
AXX     99.98          4
AXX     99.50          1
AXX     99.00          7 

Third: 99.98 - 99.50 = 0.48 (greater than 0.02), so nothing is merged.

Fourth: 99.50 - 99.00 = 0.50 (greater than 0.02), so nothing is merged.

I want to do this in Python.

2 Answers

pandas style:

import pandas as pd

# Read the whitespace-delimited file from the question
df = pd.read_table('energy.txt', sep=r'\s+')

# generate a value (label) with which we can group rows together
label = (df['energy'].diff() < -0.02).astype('int')
df['label'] = label.cumsum()
print(df)
#   path  energy  counter  label
# 0  AXX  100.00        1      0
# 1  AXX   99.99        2      0
# 2  AXX   99.98        1      0
# 3  AXX   99.50        1      1
# 4  AXX   99.00        7      2

# Aggregate the count for each label group
grouped = df.groupby(['label'])
counts = grouped[['counter']].agg('sum')
print(counts)
#        counter
# label         
# 0            4
# 1            1
# 2            7

# Find the index of the row with the minimum energy per group
idx = grouped['energy'].agg(lambda col: col.idxmin())

# Select only those rows from df
# (.ix has been removed from pandas; .loc does label-based selection)
result = df.loc[idx, ['path', 'energy', 'label']]

# Merge in the computed counts
result = pd.merge(result, counts, left_on=['label'], right_index=True)
result = result.loc[:, ['path', 'energy', 'counter']]
print(result)

yields

  path  energy  counter
2  AXX   99.98        4
3  AXX   99.50        1
4  AXX   99.00        7
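
The label-then-aggregate steps above can probably be condensed into a single groupby call. The following is only a sketch, assuming pandas 0.25 or later for named aggregation. The function name `merge_close_rows` and the `tol` parameter are made up for illustration, and the sample data is inlined so the snippet runs without energy.txt. It takes the last row of each group instead of using idxmin, which gives the same rows here because the energies are in descending order.

```python
import io

import pandas as pd

# Sample data copied from the question, inlined so no file is needed
SAMPLE = """path   energy      counter
AXX    100.00          1
AXX     99.99          2
AXX     99.98          1
AXX     99.50          1
AXX     99.00          7
"""


def merge_close_rows(df, tol=0.02):
    """Hypothetical helper: collapse runs of rows whose energies are close."""
    # A new group starts whenever the energy drops by more than tol
    label = (df['energy'].diff() < -tol).cumsum().rename('label')
    return (
        df.groupby(label)
          .agg(path=('path', 'last'),
               energy=('energy', 'last'),    # keep the later value
               counter=('counter', 'sum'))   # add the counters together
          .reset_index(drop=True)
    )


df = pd.read_csv(io.StringIO(SAMPLE), sep=r'\s+')
result = merge_close_rows(df)
print(result)
```

Note that the resulting index is renumbered from 0 rather than keeping the original row labels 2, 3, 4.
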
answered 2013-04-27T19:25:35.583

Something like this, using a dictionary:

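with open("energy.txt") as f:
    next(f)  # skip the header line
    dic = {}
    counter = 1
    # each entry is [energy, count] for one group, keyed by group number
    dic[counter] = list(map(float, next(f).split()[1:]))
    for line in f:
        curr = list(map(float, line.split()[1:]))
        if abs(dic[counter][0] - curr[0]) < 0.02:
            # close enough: keep the newer energy and add up the counts
            dic[counter] = [curr[0], dic[counter][1] + curr[1]]
        else:
            # too far apart: start a new group
            counter += 1
            dic[counter] = curr

for key in sorted(dic):
    print(dic[key])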

Output:

[99.98, 4.0]
[99.5, 1.0]
[99.0, 7.0]
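
The dictionary version drops the path column and turns the counters into floats. If that matters, a variant along these lines might do. It is a sketch only: the function name `merge_close_rows`, the `tol` parameter, and the inlined sample rows are assumptions for illustration, and it takes the data lines (without the header) as any iterable of strings rather than reading a file.

```python
def merge_close_rows(lines, tol=0.02):
    """Hypothetical helper: merge consecutive rows whose energies differ by less than tol."""
    rows = []  # each entry is (path, energy, counter)
    for line in lines:
        path, energy, counter = line.split()
        energy, counter = float(energy), int(counter)
        if rows and abs(rows[-1][1] - energy) < tol:
            # Close to the previous row: keep the newer energy, add the counters
            rows[-1] = (path, energy, rows[-1][2] + counter)
        else:
            rows.append((path, energy, counter))
    return rows


# Data rows from the question, header already skipped
sample = [
    "AXX    100.00          1",
    "AXX     99.99          2",
    "AXX     99.98          1",
    "AXX     99.50          1",
    "AXX     99.00          7",
]

merged = merge_close_rows(sample)
for row in merged:
    print(*row)
```

With a real file, the same function could be called on the open file object after skipping the header with `next(f)`.
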
answered 2013-04-27T18:47:56.693