
I have x values and their corresponding counts in a file. I read them in as a list of tuples of the following form:

dat = [(0.02, 1), 
(0.0211, 1), 
(0.021, 1), 
(0.023, 1), 
(0.0251, 1), 
(0.12, 2), 
(0.141, 1), 
(0.14, 3), 
(0.171, 1), 
(0.462, 9),
(0.467, 10),
(0.478, 15), 
(0.804, 20), 
(0.815, 31), 
(0.815, 24),
(2.72, 164), 
(2.78, 147), 
(2.8, 128),
(5.78, 6), 
(5.83, 1), 
(5.8603, 1),
(5.94, 17), 
(8.63, 3), 
(8.87, 5),  
(18.601, 1), 
(19.0, 7), 
(21.0, 2), 
(22.0, 4)]

How can I convert these into counts over equal intervals? For example, intervals with an increment of 0.2.

x    count
0    0
0.5  12
1.0  75
1.5  0
2.0  0
2.5  0
3.0  439
... 

3 Answers

3

One approach using pandas (with pandas imported as pd and NumPy as np):

In [74]: df = pd.DataFrame.from_records(dat).set_index(0)

In [75]: counts = df.groupby(lambda x: np.floor(x / 0.5) * 0.5).count()

In [76]: counts
Out[76]: 
       1
0.0   12
0.5    3
2.5    3
5.5    4
8.5    2
18.5   1
19.0   1
21.0   1
22.0   1

You can fill in the empty intervals with zero counts (the stop value is 22.5 so that the 22.0 bin is kept):

In [77]: counts.reindex(np.arange(0, 22.5, 0.5)).fillna(0)
Out[77]: 
       1
0.0   12
0.5    3
1.0    0
1.5    0
2.0    0
2.5    3
3.0    0
3.5    0
4.0    0

etc ...
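
Note that count() above gives the number of rows that fall into each bin. If the goal is instead to add up the second column within each bin (which is what the 75 and 439 in the question suggest), a minimal sketch could look like this. The column names x, count and bin are my own labels, the data is an abbreviated sample of dat, and a reasonably recent pandas is assumed.

```python
import numpy as np
import pandas as pd

# Abbreviated sample of the question's data (assumption: same (x, count) layout).
dat = [(0.02, 1), (0.12, 2), (0.462, 9), (0.804, 20),
       (0.815, 31), (2.72, 164), (2.8, 128)]

step = 0.5
df = pd.DataFrame(dat, columns=["x", "count"])

# Label every row with the lower edge of the bin it falls into.
df["bin"] = np.floor(df["x"] / step) * step

# Sum the counts per bin instead of counting rows.
totals = df.groupby("bin")["count"].sum()

# Add the bins that contain no data, with a total of zero.
full_index = np.arange(0, totals.index.max() + step, step)
totals = totals.reindex(full_index, fill_value=0)

print(totals)
```

Here the bins are labelled by their lower edge, as in the groupby above, so the label 0.5 covers 0.5 <= x < 1.0.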
answered 2012-11-22T18:27:04.330
1

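Here is a reasonable solution; the upper bound of each bin is stored in bins.

import numpy as np

min_bin_upper = 0
max_bin_upper = 100
bin_step = 0.5

bins = np.arange(min_bin_upper, max_bin_upper, bin_step)
counts = np.zeros(len(bins))
i = 0
for x, c in sorted(data):  # the scan assumes ascending x values
    # advance to the first bin whose upper bound is greater than x;
    # a loop is needed because x may jump over several empty bins
    while i < len(bins) and x >= bins[i]:
        i += 1
    if i >= len(bins):
        break  # x lies beyond the last bin
    counts[i] += c

print(counts)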

I tested it with:

data = [(0.1, 3), (0.2, 1),(0.3, 10)]
min_bin_upper = 0
max_bin_upper = 1
bin_step = 0.2

It returns:

[  0.   3.  11.   0.   0.]

I hope this is what you need.
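
As a possible alternative to the hand-written loop, np.histogram accepts a weights argument, which adds up the count of each x instead of counting occurrences. The sketch below reuses the small test data from above; the names xs, ws, n_bins and edges are mine, and the number of bins is an assumption chosen to cover the range 0 to 1.

```python
import numpy as np

data = [(0.1, 3), (0.2, 1), (0.3, 10)]
bin_step = 0.2
n_bins = 5  # assumption: enough bins to cover 0 .. 1

xs = np.array([x for x, _ in data])
ws = np.array([c for _, c in data])

# Build the edges from integers to avoid the length ambiguity that
# np.arange can show with a floating-point step.
edges = np.arange(n_bins + 1) * bin_step

# weights=ws sums the counts that fall into each bin.
counts, _ = np.histogram(xs, bins=edges, weights=ws)

print(counts)
```

np.histogram uses half-open intervals [left, right), with only the last bin closed on the right, so here counts[k] belongs to the bin starting at edges[k]. That is shifted by one position compared with the loop above, which indexes by upper bound.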

answered 2012-11-22T18:18:39.767
0

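Here is an approach that uses only the Python standard library. It assumes dat is sorted well enough that no value appears before a smaller value belonging to an earlier bin:

import math

step_size = 0.5
result = {}
i = 0  # index of the first tuple not yet assigned to a bin

# number of upper bounds needed to cover 0 .. max x
n_steps = int(math.ceil(max(dat)[0] / step_size)) + 1

for intval in [x * step_size for x in range(n_steps)]:
    result[intval] = 0
    # consume tuples until one lies above the current upper bound
    for n, count in dat[i:]:
        if n > intval:
            break
        result[intval] += count
        i += 1

print(sorted(result.items()))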


[(0.0, 0), (0.5, 46), (1.0, 75), (1.5, 0), (2.0, 0), (2.5, 0), (3.0, 439), (3.5, 0),
 (4.0, 0), (4.5, 0), (5.0, 0), (5.5, 0), (6.0, 25), (6.5, 0), (7.0, 0), (7.5, 0),
 (8.0, 0), (8.5, 0), (9.0, 8), (9.5, 0), (10.0, 0), (10.5, 0), (11.0, 0), (11.5, 0),
 (12.0, 0), (12.5, 0), (13.0, 0), (13.5, 0), (14.0, 0), (14.5, 0), (15.0, 0), (15.5, 0),
 (16.0, 0), (16.5, 0), (17.0, 0), (17.5, 0), (18.0, 0), (18.5, 0), (19.0, 8), (19.5, 0),
 (20.0, 0), (20.5, 0), (21.0, 2), (21.5, 0), (22.0, 4)]
answered 2012-11-22T18:27:07.093