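Amendment:
If I have a pandas DataFrame with 5 columns (Col1, Col2, Col3, Col4, and Col5), I need to get the maximum Pearson correlation coefficient between (Col2, Col3), (Col2, Col4), and (Col2, Col5), taking the values in Col1 into account, where the modified values of Col2 are obtained from the following formulas: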
df['Col1']=np.power((df['Col1']),B)
df['Col2']=df['Col2']*df['Col1']
where B is the variable to be tuned (a single value) so as to obtain the maximum Pearson correlation coefficient between (new Col2, Col3), (new Col2, Col4), and (new Col2, Col5).
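To make the transformation concrete, here is a minimal sketch that evaluates the three correlations for one given B. The helper name correlations_for_b and the TARGETS list are illustrative assumptions, not part of the original post. Unlike the two formulas, it does not overwrite Col1 or Col2, so it can be called repeatedly with different values of B.

```python
import numpy as np
import pandas as pd

# Columns that the modified Col2 is compared against.
TARGETS = ["Col3", "Col4", "Col5"]


def correlations_for_b(df: pd.DataFrame, b: float) -> pd.Series:
    """Pearson correlation of the modified Col2 with Col3, Col4 and Col5.

    Hypothetical helper: it applies Col2 * Col1**B without mutating df.
    """
    # Cast to float so that fractional exponents behave as expected.
    scaled_col1 = np.power(df["Col1"].astype(float), b)
    new_col2 = df["Col2"] * scaled_col1
    # corrwith uses the Pearson method by default and returns one value
    # per target column.
    return df[TARGETS].corrwith(new_col2)
```

Calling correlations_for_b(df, 1.0) would return a Series indexed by Col3, Col4, and Col5.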
Update:
The table above contains the 5 columns I mentioned, and the Pearson correlation coefficients between (Col2, Col3), (Col2, Col4), and (Col2, Col5) are shown in the table below.
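I need to change the values of Col2 according to the two formulas mentioned above, where the value being varied is B.
So the question is: how do I find the best value of B such that each new correlation coefficient is greater than or equal to its old counterpart?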
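One possible way to approach this is a simple grid search over candidate values of B, sketched below. Several things here are assumptions rather than part of the question: the function name best_b, the choice to score each B by the mean of the three correlations (the post does not say how the three should be combined), and the idea of passing in a list of candidates. A candidate is kept only if none of the three correlations falls below its original value.

```python
import numpy as np
import pandas as pd

TARGETS = ["Col3", "Col4", "Col5"]


def best_b(df: pd.DataFrame, candidates):
    """Grid-search B; return (B, mean correlation, correlations) or None.

    Assumption: the objective is the mean of the three Pearson
    correlations, subject to none of them dropping below its baseline.
    """
    col1 = df["Col1"].astype(float)
    # Correlations of the unmodified Col2 (the "old" values).
    baseline = df[TARGETS].corrwith(df["Col2"])
    best = None
    for b in candidates:
        new_col2 = df["Col2"] * np.power(col1, b)
        corr = df[TARGETS].corrwith(new_col2)
        # Skip B values that give undefined or worse correlations.
        if corr.isna().any() or (corr < baseline).any():
            continue
        score = corr.mean()
        if best is None or score > best[1]:
            best = (float(b), float(score), corr)
    return best
```

For example, best_b(df, np.linspace(0, 3, 301)) would scan B from 0 to 3 in steps of 0.01; that range is an arbitrary illustration. Non-negative candidates are the safer choice when Col1 contains zeros, because zero raised to a negative power is infinite.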
Update 2:
Col1,Col2,Col3,Col4,Col5
2,0.051361397,2618,1453,1099
4,0.053507779,306,153,150
2,0.041236151,39,54,34
6,0.094526419,2755,2209,1947
4,0.079773397,2313,1261,1022
4,0.083891415,3528,2502,2029
6,0.090737243,3594,2781,2508
2,0.069552772,370,234,246
2,0.052401789,690,402,280
2,0.039930675,1218,846,631
4,0.065952096,1706,523,453
2,0.053064126,314,197,123
6,0.076847486,4019,1675,1452
2,0.044881545,604,402,356
2,0.073102611,2214,1263,1050
0,0.046998526,938,648,572
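For reproducibility, the sample above can be loaded straight from text. The sketch below is only one way to do it; the variable names CSV_TEXT, df, and baseline are illustrative. It also computes the original correlations of Col2 with the other three columns, which serve as the "old" values to compare against.

```python
import io

import pandas as pd

# The sample data from the post, copied verbatim.
CSV_TEXT = """Col1,Col2,Col3,Col4,Col5
2,0.051361397,2618,1453,1099
4,0.053507779,306,153,150
2,0.041236151,39,54,34
6,0.094526419,2755,2209,1947
4,0.079773397,2313,1261,1022
4,0.083891415,3528,2502,2029
6,0.090737243,3594,2781,2508
2,0.069552772,370,234,246
2,0.052401789,690,402,280
2,0.039930675,1218,846,631
4,0.065952096,1706,523,453
2,0.053064126,314,197,123
6,0.076847486,4019,1675,1452
2,0.044881545,604,402,356
2,0.073102611,2214,1263,1050
0,0.046998526,938,648,572
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# Baseline Pearson correlations of the unmodified Col2.
baseline = df[["Col3", "Col4", "Col5"]].corrwith(df["Col2"])
print(baseline)
```

Note that the last row has Col1 equal to 0, so for any B greater than 0 the modified Col2 in that row becomes 0, and a negative B would produce an infinite value there.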