python - How to reduce a pandas Series by applying an operation to each group of N sequential elements

Question

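Suppose I have a pandas Series and I want to take the mean of every group of 8 rows. I have no prior knowledge of the size of the Series, and the index may not start at 0. I currently have the following: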

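import numpy as np
import pandas as pd

N = 8

s = pd.Series(np.random.random(50 * N))

# Number of complete groups of N rows
n_sets = s.shape[0] // N

# Start and stop positions of each consecutive block of N rows
split = ([m * N for m in range(n_sets)],
         [m * N for m in range(1, n_sets + 1)])

out_array = np.zeros(n_sets)

for i, (a, b) in enumerate(zip(*split)):
    # Select rows a..b-1 by position and average them
    out_array[i] = s.loc[s.index[a:b]].mean()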

Is there a shorter way to do this?

score 1 · Accepted Answer

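You can use groupby: floor-divide the index by N so that every run of N consecutive labels shares one group key, then call mean() on the groups. This relies on the index being the default 0, 1, 2, ... integer range, as it is for the example s: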

newout_array = s.groupby(s.index // N).mean().to_list()

Output:

out_array  #original solution
[0.42147899 0.55668055 0.5222594  0.46066426 0.44378491 0.52719371
 0.42479113 0.46485387 0.2800083  0.57174865 0.59207811 0.58665479
 0.52414851 0.38158931 0.51884761 0.59007469 0.3449512  0.56385373
 0.34359674 0.44524997 0.44175351 0.42339394 0.5687501  0.3140091
 0.40985639 0.46649486 0.3101396  0.45664647 0.51829052 0.38875796
 0.45428001 0.52979064 0.62545921 0.64782618 0.65265239 0.56976799
 0.64277369 0.33528876 0.45973874 0.45341751 0.52690983 0.66427599
 0.59814577 0.35575622 0.62995929 0.61582329 0.38971679 0.4771326
 0.50889137 0.25105353]


newout_array  #new solution

[0.4214789945860148, 0.5566805507021909, 0.5222593998859411, 0.46066425607167216, 0.4437849132421554, 0.5271937114894408,
 0.424791134573943, 0.4648538659945887, 0.28000829556024387, 0.5717486453029332, 0.5920781058695997, 0.5866547941460012, 
 0.5241485100329547, 0.38158931177460725, 0.5188476113762392, 0.5900746905953183, 0.34495119855714756, 0.5638537286251522, 
 0.3435967359945349, 0.44524997190104454, 0.44175351484451975, 0.42339393886425913, 0.5687501027416468, 0.3140090963728155, 
 0.40985639015924036, 0.4664948621046134, 0.3101396034068746, 0.45664647332866076, 0.5182905157666298, 0.38875796468438406, 
 0.4542800111275337, 0.5297906368971982, 0.6254592119278896, 0.6478261817988752, 0.6526523935382951, 0.569767994485338, 
 0.642773691835847, 0.3352887578683835, 0.45973873832126594, 0.45341751320112617, 0.5269098312525405, 0.6642759923683706, 
 0.5981457683986061, 0.3557562229383897, 0.6299592930489117, 0.6158232897272005, 0.38971678834383916, 0.4771325988592886, 
 0.5088913710936904, 0.25105352820427246]

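The only difference is the number of decimal places shown in each format. If you want to keep just 8 decimal places, as in the printed out_array, you can map round over the elements: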

newout_array = s.groupby(s.index // N).mean().to_list()
newout_array = list(map(lambda x: round(x, 8), newout_array))
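
The question mentions that the index may not start at 0, in which case s.index // N no longer lines up with blocks of N rows. A minimal sketch of one way around that is to group by row position instead of index label. The example data, the index starting at 100, and the names positions, group_means and reshaped_means are assumptions made for illustration, not part of the original post. The sketch also drops a trailing group with fewer than N rows, which the loop in the question ignores but groupby would otherwise include.

```python
import numpy as np
import pandas as pd

N = 8

# Hypothetical example data: 50 full groups plus 3 leftover rows,
# with an index that starts at 100 instead of 0.
rng = np.random.default_rng(0)
values = rng.random(50 * N + 3)
s = pd.Series(values, index=np.arange(100, 100 + len(values)))

# Group by row position rather than by index label, so the result
# does not depend on what the index looks like.
positions = np.arange(len(s)) // N
group_means = s.groupby(positions).mean()

# The loop in the question only averages complete groups of N rows;
# drop the trailing partial group here to match that behaviour.
n_sets = len(s) // N
group_means = group_means.iloc[:n_sets]

# Cross-check with a plain NumPy reshape over the complete groups.
reshaped_means = s.to_numpy()[: n_sets * N].reshape(n_sets, N).mean(axis=1)
```

If the partial group at the end should be kept, the iloc step can simply be left out; the reshape variant only works on complete groups.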
