Questions tagged [featuretools]

195 questions

0 votes

1 answer

129 views

featuretools - Select amount of past data when calculating features

I'm wondering if there is a way to automatically select the amount of past data when calculating features.

For example, I might want to predict when a customer is going to make their next purchase, so it would be good to know a count of purchases or average purchase price by different date cutoffs. e.g. Purchases in the last 12 months, last 3 months, 7 days etc.

What is the best way to approach this with featuretools?
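
A minimal sketch of the idea behind this question, written in plain pandas rather than featuretools: for one cutoff date, count and average the purchases that fall inside several look-back windows. The table, its column names, and the helper `windowed_features` are hypothetical and only illustrate the computation. featuretools documents a `training_window` argument for `ft.dfs` and `ft.calculate_feature_matrix` that plays a similar role alongside `cutoff_time`; the sketch does not call it.

```python
import pandas as pd

# Hypothetical purchase log; column names are illustrative only.
purchases = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2],
    "purchase_time": pd.to_datetime([
        "2018-01-05", "2018-05-20", "2018-06-18", "2017-03-01", "2018-06-19",
    ]),
    "amount": [20.0, 35.0, 10.0, 50.0, 5.0],
})


def windowed_features(df, cutoff, windows):
    """Per-customer count and mean of purchases in (cutoff - window, cutoff]."""
    cutoff = pd.Timestamp(cutoff)
    frames = []
    for label, window in windows.items():
        start = cutoff - pd.Timedelta(window)
        # Keep only rows inside the look-back window that ends at the cutoff.
        in_window = df[(df["purchase_time"] > start) & (df["purchase_time"] <= cutoff)]
        agg = in_window.groupby("customer_id")["amount"].agg(["count", "mean"])
        agg.columns = [f"count_{label}", f"mean_{label}"]
        frames.append(agg)
    # Reindex so customers with no purchases in any window still get a row.
    all_customers = pd.Index(df["customer_id"].unique(), name="customer_id")
    return pd.concat(frames, axis=1).reindex(all_customers)


features = windowed_features(
    purchases,
    cutoff="2018-06-21",
    windows={"7d": "7D", "90d": "90D", "365d": "365D"},
)
print(features)
```

Each window contributes its own pair of columns, so the same cutoff yields "last 7 days", "last 90 days", and "last 365 days" features side by side.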

featuretools

2018-06-21T10:11:21.803

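0 votes

0 answers

138 views

featuretools - How to fix an error when installing the featuretools Python package

I get this error when installing featuretools... why?

NewConnectionError<'Failed to establish a new connection: [Errno 11004] getaddrinfo failed',>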

featuretools

2018-06-21T11:12:21.743

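0 votes

2 answers

710 views

python - TypeError when creating an EntitySet in Featuretools: 'str' object does not support item assignment

I have these 3 dataframes:

This is the script I am trying to run for feature synthesis:

However, whenever I include the "STATUS" column, this error occurs: TypeError: 'str' object does not support item assignment

If I leave out the "STATUS" column, it works with a few rows of the dataframes. As the number of rows grows (and only using STATUS as a key would resolve it), another error occurs: AssertionError: Index is not unique on dataframe (Entity Bureau_balance)

Thanks in advance!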

python featuretools

2018-06-25T16:42:37.740

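0 votes

1 answer

109 views

featuretools - What are the dtypes of the new features?

Why are new features created with transform primitives such as WEEKDAY, DayOfMonth, YEAR, and MonthOfYear created as integers, i.e. as continuous features? Shouldn't they be categorical features? I mean, when these features are created, shouldn't the dtype of these columns be "object" rather than "int"?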
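
To make the dtype question concrete, here is a small pandas-only sketch (not featuretools output). It shows that a weekday extracted from a datetime column is an integer, and how such a column could be recast as categorical or one-hot encoded afterwards. The sample dates and variable names are made up for illustration.

```python
import pandas as pd

# Made-up sample dates: a Monday, a Tuesday, and a Saturday.
dates = pd.Series(pd.to_datetime(["2018-06-25", "2018-06-26", "2018-06-30"]))

# pandas encodes the weekday as an integer (Monday=0 ... Sunday=6).
weekday_int = dates.dt.weekday

# Recast as categorical so downstream code does not treat it as continuous.
weekday_cat = weekday_int.astype("category")

# One-hot encode the categorical column for models that need numeric input.
one_hot = pd.get_dummies(weekday_cat, prefix="weekday")

print(weekday_int.dtype, weekday_cat.dtype)
print(one_hot)
```

Whether an integer or a categorical encoding is preferable depends on the model: tree-based models often cope with the integer form, while linear models usually benefit from the one-hot form.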

featuretools

2018-06-26T06:55:06.730

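0 votes

0 answers

232 views

featuretools - Grouping time series data by month-year with featuretools

I have time series data with application numbers and loan amounts. How can I use featuretools to compute the number of applications and the average loan amount grouped by month-year, without adding a month-year relationship back to the main entity?

I have done this with pandas, but I am trying to explore the featuretools package and would like to know whether it has similar group-by functionality.

Below is an example of the pandas version, which I would like to replicate using featuretools.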

featuretools

2018-07-02T16:00:46.933

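0 votes

2 answers

448 views

python - Excluding the current row from feature engineering in Python featuretools

I am generating historical features for the current row with featuretools, for example the number of transactions made in the last hour during a session.

The featuretools package includes the parameter cutoff_time to exclude all rows that occur after cutoff_time.

I set cutoff_time to time_index value - 1 second, so I expect the features to be based on historical data minus the current row. This allows including the response variable from historical rows.

The problem is that when this parameter is not equal to the time_index variable, I get a bunch of NaNs in both the original and the generated features.

Example:

Output (excerpt):

The column sessions.SUM(transactions.amount) should be >= 0. The original features session_id, product_id, and amount are also NaN.

If transactions_df['cutoff_time'] = transactions_df['transaction_time'] (no time delta), this code works, but it includes the current row.

What is the correct way to compute aggregations and transformations that exclude the current row from the calculation?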
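
As a point of comparison, the result described here (per-session aggregates over strictly earlier rows) can be sketched in plain pandas. This is not a featuretools fix and does not reproduce the elided example. The table and the `prior_sum` and `prior_count` names are hypothetical, and the sketch assumes timestamps are unique within a session.

```python
import pandas as pd

# Hypothetical transactions, sorted so "earlier" means "above" within a session.
transactions = pd.DataFrame({
    "session_id": [1, 1, 1, 2, 2],
    "transaction_time": pd.to_datetime([
        "2018-07-05 10:00", "2018-07-05 10:05", "2018-07-05 10:20",
        "2018-07-05 11:00", "2018-07-05 11:30",
    ]),
    "amount": [10.0, 20.0, 5.0, 7.0, 3.0],
}).sort_values(["session_id", "transaction_time"])

grouped = transactions.groupby("session_id")["amount"]

# cumsum() includes the current row; subtracting the row's own amount
# leaves the sum over strictly earlier rows of the same session.
prior_sum = grouped.cumsum() - transactions["amount"]

# cumcount() numbers rows from 0 within each group, which equals
# the number of earlier rows in the session.
prior_count = grouped.cumcount()

transactions = transactions.assign(prior_sum=prior_sum, prior_count=prior_count)
print(transactions)
```

The first row of each session gets 0 rather than NaN, which matches the expectation that the sum should be >= 0.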

python pandas datetime feature-extraction featuretools

2018-07-05T12:45:11.463

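0 votes

1 answer

774 views

featuretools - Creating an index from multiple columns in Featuretools

I am trying to create an entity from a dataframe using the entity_from_dataframe function in featuretools. Is there a way to define an index that consists of more than one column? I am not sure whether it needs a list, a tuple, or some other data structure. Here is the code:

It produces the following error about hashability:

TypeError: unhashable type: 'list'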
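
The error suggests that the index argument is expected to be a single column name rather than a list. One common workaround, sketched here in plain pandas under that assumption, is to build a single surrogate key column from the columns that jointly identify a row, then pass that column's name as the index. The dataframe and the `store_day_id` column are hypothetical.

```python
import pandas as pd

# Hypothetical dataframe whose rows are identified by (store_id, day).
df = pd.DataFrame({
    "store_id": [1, 1, 2],
    "day": ["2018-07-01", "2018-07-02", "2018-07-01"],
    "sales": [100, 120, 80],
})

key_cols = ["store_id", "day"]

# Join the key columns into one string per row to get a single-column key.
df["store_day_id"] = df[key_cols].astype(str).agg("_".join, axis=1)

# A usable index must be unique; fail early if the combination is not.
assert df["store_day_id"].is_unique

print(df)
```

The single string "store_day_id" can then be given wherever one index column name is required, while the original columns stay available as ordinary variables.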

featuretools

2018-07-06T00:36:14.697

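0 votes

1 answer

209 views

featuretools - Details of the algorithms behind Deep Feature Synthesis and featuretools?

To use them properly, it is important to understand the algorithmic and mathematical foundations of Deep Feature Synthesis and featuretools. Are there papers, patents, or comparisons with other tools?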

featuretools

2018-07-08T18:26:20.153

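0 votes

1 answer

319 views

featuretools - es.normalize_entity error: Variable not found in entity

I am using the featuretools documentation to learn about entitysets and am currently getting the error KeyError: 'Variable: device not found in entity' for the following code:

This follows the URL https://docs.featuretools.com/loading_data/using_entitysets.html

From the API of es.normalize_entity, it appears that the function will create a new entity "sessions" with index "session_id" and the remaining 3 variables, but the error is:

C:\Users\s_belvi\AppData\Local\Continuum\Anaconda2\lib\site-packages\featuretools\entityset\entity.pyc in _get_variable(self, variable_id)
    250                 return v
    251
--> 252         raise KeyError("Variable: %s not found in entity" % (variable_id))
    253
    254     @property

KeyError: 'Variable: device not found in entity'

Do we need to create the entity "sessions" separately before using es.normalize_entity? It looks like a syntax problem somewhere in the flow, probably some small mistake.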

featuretools

2018-07-29T15:27:59.563

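0 votes

0 answers

133 views

python - Integrating sklearn and featuretools?

Does featuretools (or an additional Python package) offer a way to integrate it with general-purpose ML libraries such as sklearn? For example, it would be good to test the predictive power of a feature and, if it is high enough, generate more similar features (e.g. using the same initial variables). In other words, can the process of generating new features be guided by their predictive power?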

python scikit-learn featuretools

2018-08-06T21:31:54.220
