Featuretools 提供了处理分类变量的集成功能
variable_types={"product_id": ft.variable_types.Categorical} https://docs.featuretools.com/loading_data/using_entitysets.html
然而,这些应该是strings
或pandas.Category
类型以实现与 Featuretools 的最佳兼容性?
编辑
此外,是否需要手动指定所有列,如 https://github.com/Featuretools/predict-appointment-noshow/blob/master/Tutorial.ipynb或者它们是否会从拟合熊猫数据类型中自动推断出来
import featuretools.variable_types as vtypes
variable_types = {'gender': vtypes.Categorical,
'patient_id': vtypes.Categorical,
'age': vtypes.Ordinal,
'scholarship': vtypes.Boolean,
'hypertension': vtypes.Boolean,
'diabetes': vtypes.Boolean,
'alcoholism': vtypes.Boolean,
'handicap': vtypes.Boolean,
'no_show': vtypes.Boolean,
'sms_received': vtypes.Boolean}