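I have a data frame like the one below, and I want to oversample on the column "role" (in the real case, the number of rows and columns is much larger than in this minimal example):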
role value
pop_13vdpn1_site_1 1 1
pop_13vdpn1_site_1 1 1
pop_13vdpn1_site_1 1 2
pop_13vdpn1_site_1 1 1
pop_13vdpn1_site_1 1 1
pop_13vdpn1_site_1 1 2
pop_13vdpn1_site_1 1 1
pop_13vdpn1_site_1 2 1
pop_13vdpn1_site_1 2 1
pop_13vdpn1_site_1 2 1
pop_13vdpn1_site_2 2 1
pop_13vdpn1_site_2 2 2
pop_13vdpn1_site_2 2 1
pop_13vdpn1_site_2 2 1
pop_13vdpn1_site_2 2 1
pop_13vdpn1_site_2 2 1
pop_13vdpn1_site_2 2 1
pop_13vdpn1_site_2 2 1
pop_13vdpn1_site_2 2 1
pop_13vdpn1_site_3 2 1
[...........]
Index: 20 entries, pop_13vdpn1_site_1 to pop_13vdpn1_site_1
Data columns (total 2 columns):
role 20 non-null int64
value 20 non-null int64
This is what I'm doing:
X,y = smote.fit_sample(df,df[['role']])
X
role value
0 1 1
1 1 1
2 1 2
3 1 1
4 1 1
5 1 2
6 1 1
7 2 1
8 2 1
[.........]
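It works, but the problem is that the result has a plain integer index, and I need to keep the original index (pop_13vdpn1_site_1, etc.). Is that possible?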
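To make the goal concrete, here is a minimal sketch of the kind of result I am after. It assumes the resampler returns the original rows first, in their original order, followed by the newly generated rows. Synthetic rows do not correspond to any original row, so they get a placeholder label. The helper name `restore_index` and the `synthetic_` prefix are hypothetical, not part of any library.

```python
import pandas as pd


def restore_index(original, resampled, synthetic_prefix="synthetic_"):
    """Re-attach the original index labels to resampled data.

    ASSUMPTION: the first len(original) rows of `resampled` are the original
    rows in their original order; everything after them is synthetic.
    """
    n_orig = len(original)
    n_new = len(resampled) - n_orig
    if n_new < 0:
        raise ValueError("resampled data has fewer rows than the original")

    # Original labels first, then placeholder labels for the synthetic rows.
    labels = list(original.index) + [
        f"{synthetic_prefix}{i}" for i in range(n_new)
    ]

    # Accepts a DataFrame, a NumPy array, or a list of rows.
    out = pd.DataFrame(resampled, columns=original.columns)
    out.index = labels
    return out


# Tiny stand-in for the real data; the last row plays the role of a
# synthetic sample appended by the oversampler.
df_small = pd.DataFrame(
    {"role": [1, 1, 2], "value": [1, 2, 1]},
    index=["pop_13vdpn1_site_1", "pop_13vdpn1_site_1", "pop_13vdpn1_site_2"],
)
resampled_small = [[1, 1], [1, 2], [2, 1], [2, 1]]
print(restore_index(df_small, resampled_small))
```

With the real data this would be called as `restore_index(df, X)` on the `X` returned above. Note that in recent imbalanced-learn releases the method is named `fit_resample` rather than `fit_sample`.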