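I'm struggling with this task:
What I've done so far: I have 8760 values, which I sorted into bins according to a set of intervals. There are 10 intervals. I then grouped the values by interval.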

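Problem: I now have to map each "level" of this data frame (df1) to an index label of another data frame (df2), so that a calculation can be carried out row by row. That is, the 10 intervals point to the 10 index labels of the other data frame.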

bins=[-1,0,1,1.065,1.230,1.500,1.950,2.800,4.500,6.200,13.10]
arr=pd.cut(df1,bins)
grouped=df1.groupby(arr)
pd.value_counts(arr)


Out[58]:
(-1, 0]           4015  
(0, 1]            1948  
(1.95, 2.8]       646  
(2.8, 4.5]        542  
(1.5, 1.95]       539  
(1.23, 1.5]       427  
(1.065, 1.23]     337  
(4.5, 6.2]        127  
(1, 1.065]        125  
(6.2, 13.1]        54  
dtype: int64  

Now I have to use this to reference the index of (df2):

data={'f11':['0','0','-0.008','0.13','0.33','0.568','0.873','1.132','1.06','0.678'],'f12':['0','0','0.588','0.683','0.487','0.187','-0.392','-1.237','-1.6','-0.327'],'f13':['0','0','-0.062','-0.151','-0.221','-0.295','-0.362','-0.412','-0.359','-0.25'],'f21':['0','0','-0.06','-0.019','0.055','0.109','0.226','0.288','0.264','0.156'],'f22':['0','0','0.072','0.066','-0.064','-0.152','-0.462','-0.823','-1.127','-1.377'],'f23':['0','0','-0.022','-0.029','-0.026','-0.014','0.001','0.056','0.131','0.251']}  

df2=DataFrame(data,columns=['f11','f12','f13','f21','f22','f23'],index=['1','2','3','4','5','6','7','8','9','10'])

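Desired solution: (-1, 0] refers to index '1', (0, 1] refers to index '2', and so on. The calculation (f11+f12+(f21*f22*f23)) must then be performed for all 8760 values, row by row, using the values from the referenced index.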

1 Answer

  1. Map categories into integer indexes

    # arr.unique() lists intervals in order of first appearance, not bin order,
    # so use the ordered category codes: 0 for (-1, 0], ..., 9 for (6.2, 13.1].
    category_as_int = pd.Series(arr).cat.codes

  2. Add category_as_int as a column to df1

    df1 = pd.DataFrame(df1)  # Converts df1 to a DataFrame if it's a Series

    df1['key'] = category_as_int

  3. Merge df1 and df2 (Note change in index for df2)

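    # One row per bin, 0-based (len(data) would count the 6 columns, not the 10 rows).
    # The values are strings in the question, so convert them to floats for step 4.
    df2 = pd.DataFrame(data, columns=['f11','f12','f13','f21','f22','f23'], index=np.arange(10)).astype(float)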

    df = pd.merge(df1, df2, left_on='key', right_index=True, how='left')

  4. Perform operation on all 8K+ rows

    df.f11 + df.f12 + (df.f21 * df.f22 * df.f23)
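
The four steps above can be put together in one self-contained sketch. It is only an illustration under stated assumptions: the real df1 is not shown in the question, so `values` is synthetic random data, and the names `values`, `coeffs`, `key`, and `result` are made up for this example. It also uses `pd.cut(..., labels=False)`, which returns the bin number directly, as a shortcut for step 1.

```python
import numpy as np
import pandas as pd

bins = [-1, 0, 1, 1.065, 1.230, 1.500, 1.950, 2.800, 4.500, 6.200, 13.10]

# Synthetic stand-in for the 8760 values (assumption: the real df1 is not shown).
rng = np.random.default_rng(0)
values = pd.Series(rng.uniform(-0.99, 13.0, size=8760), name="value")

# Coefficient table from the question; the strings are converted to floats.
data = {
    'f11': ['0', '0', '-0.008', '0.13', '0.33', '0.568', '0.873', '1.132', '1.06', '0.678'],
    'f12': ['0', '0', '0.588', '0.683', '0.487', '0.187', '-0.392', '-1.237', '-1.6', '-0.327'],
    'f13': ['0', '0', '-0.062', '-0.151', '-0.221', '-0.295', '-0.362', '-0.412', '-0.359', '-0.25'],
    'f21': ['0', '0', '-0.06', '-0.019', '0.055', '0.109', '0.226', '0.288', '0.264', '0.156'],
    'f22': ['0', '0', '0.072', '0.066', '-0.064', '-0.152', '-0.462', '-0.823', '-1.127', '-1.377'],
    'f23': ['0', '0', '-0.022', '-0.029', '-0.026', '-0.014', '0.001', '0.056', '0.131', '0.251'],
}
coeffs = pd.DataFrame(data).astype(float)  # default index 0..9, one row per bin

# labels=False returns the position of each value's bin (0 for (-1, 0], 1 for (0, 1], ...).
key = pd.cut(values, bins, labels=False)

# Attach the matching coefficient row to every value, then evaluate the formula.
df = pd.DataFrame({"value": values, "key": key})
df = df.merge(coeffs, left_on="key", right_index=True, how="left")
df["result"] = df.f11 + df.f12 + (df.f21 * df.f22 * df.f23)

print(df[["value", "key", "result"]].head())
```

If df2 has to keep its original labels '1' to '10', one option is to add 1 to `key` and convert it to strings before merging, so that the key dtype matches the index dtype.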

Answered 2014-02-25T16:01:52.690