I have these 3 dataframes:
df_train truncated:____________________
   SK_ID_CURR  TARGET  NAME_CONTRACT_TYPE_Cash loans  NAME_CONTRACT_TYPE_Revolving loans  CODE_GENDER_F  CODE_GENDER_M
0      100002       1                              1                                   0              0              1
1      100003       0                              1                                   0              1              0
2      100004       0                              0                                   1              0              1
3      100006       0                              1                                   0              1              0
4      100007       0                              1                                   0              0              1
df_bureau truncated:____________________
SK_ID_CURR SK_ID_BUREAU CREDIT_ACTIVE_Active
0 100002 5714464 1
1 100002 5714465 1
2 215354 5714466 1
3 215354 5714467 1
4 215354 5714468 1
bureau_balance truncated 3:____________________
SK_ID_BUREAU MONTHS_BALANCE STATUS_C
0 5715448 0 1
1 5715448 -1 1
2 5715448 -2 1
3 5715448 -3 1
4 5715448 -4 1
This is the script I am trying to run for feature synthesis:
entities = {
    "train": (df_train, "SK_ID_CURR"),
    "bureau": (df_bureau, "SK_ID_BUREAU"),
    "bureau_balance": (df_bureau_balance, "MONTHS_BALANCE", "STATUS", "SK_ID_BUREAU"),
}
relationships = [
    ("bureau", "SK_ID_BUREAU", "bureau_balance", "SK_ID_BUREAU"),
    ("train", "SK_ID_CURR", "bureau", "SK_ID_CURR"),
]
feature_matrix_customers, features_defs = ft.dfs(entities=entities,
                                                 relationships=relationships,
                                                 target_entity="train")
However, whenever I include the "STATUS" column, this error occurs: TypeError: 'str' object does not support item assignment
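A possible explanation for the TypeError, offered as a guess rather than a confirmed diagnosis: in the legacy `entities` dictionary, the tuple appears to be read by position as (dataframe, index, time index, variable types), not as a list of columns. If so, "STATUS" would be taken as the time index and "SK_ID_BUREAU" as the variable types, which are expected to be a dict. The sketch below uses only plain Python to imitate that positional unpacking. `unpack_entity` is a hypothetical stand-in, not a featuretools function, and the `"Index"` value is a placeholder.

```python
def unpack_entity(entity_tuple):
    """Hypothetical stand-in: read an entity tuple by position."""
    index = entity_tuple[1]
    time_index = entity_tuple[2] if len(entity_tuple) > 2 else None
    variable_types = entity_tuple[3] if len(entity_tuple) > 3 else {}
    # A dict accepts this item assignment; a str does not.
    variable_types[index] = "Index"
    return index, time_index, variable_types


try:
    # Same shape as the "bureau_balance" entry; None stands in for the dataframe.
    unpack_entity((None, "MONTHS_BALANCE", "STATUS", "SK_ID_BUREAU"))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Under that assumption, passing only the dataframe and a single index column name, as the "train" and "bureau" entries already do, would avoid the error.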
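If I leave out the "STATUS" column, it works on a dataframe with only a few rows. Once the number of rows grows, a different error occurs (and the only way I found around it is to use STATUS as a key): AssertionError: Index is not unique on dataframe (Entity bureau_balance)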
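The AssertionError would be consistent with "MONTHS_BALANCE" being used as the index of bureau_balance: in a small sample covering one SK_ID_BUREAU the month values happen to be unique, but with more rows they repeat across bureau records. A minimal pandas sketch of checking this and adding a surrogate key follows. The toy rows and the column name `bb_id` are made up for illustration and are not from the original data.

```python
import pandas as pd

# Toy rows shaped like bureau_balance; the values are invented for this sketch.
df_bureau_balance = pd.DataFrame({
    "SK_ID_BUREAU": [5715448, 5715448, 5715449, 5715449],
    "MONTHS_BALANCE": [0, -1, 0, -1],
    "STATUS_C": [1, 1, 0, 1],
})

# Neither column is unique by itself once a second bureau record appears.
months_unique = bool(df_bureau_balance["MONTHS_BALANCE"].is_unique)
bureau_unique = bool(df_bureau_balance["SK_ID_BUREAU"].is_unique)

# Only the pair (SK_ID_BUREAU, MONTHS_BALANCE) identifies a row here.
pair_unique = not df_bureau_balance.duplicated(
    ["SK_ID_BUREAU", "MONTHS_BALANCE"]
).any()

# Add a surrogate key (hypothetical name "bb_id") with one value per row.
df_bureau_balance = df_bureau_balance.reset_index(drop=True)
df_bureau_balance["bb_id"] = df_bureau_balance.index
surrogate_unique = bool(df_bureau_balance["bb_id"].is_unique)

# The entity could then be declared with the surrogate key as its index:
# "bureau_balance": (df_bureau_balance, "bb_id")
print(months_unique, bureau_unique, pair_unique, surrogate_unique)
```

With a unique single-column index in place, the relationship on "SK_ID_BUREAU" can stay as it is, since that column only needs to be unique in the parent entity "bureau".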
Thanks in advance!!