
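I am using Pandas' qcut to prepare data for a machine learning algorithm. I have products with prices, and I use the following code to discretize the data into equal-sized buckets: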

df['PriceBucket'] = pd.qcut(df['sell_prix'].sort_values(), 10, labels=False)

This code gives more detail about the bucket labels:

df['PriceBucketTitle'] = pd.qcut(df['sell_prix'].sort_values(), 10)

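As shown below, I get PriceBucket and PriceBucketTitle, and that works perfectly! Now I want to take into account the number of elements in each bucket. This code returns NaN values (as shown below):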

df['products_by_number'] = pd.qcut(df['sell_prix'], 10, labels=False).value_counts()

I know a groupby on PriceBucket would probably work, but I want to keep my data in its current row-per-product format. Here is the result:

      sell_prix PriceBucket PriceBucketTitle    products_by_number
4668    8.0          2         (6.5, 8.5]            NaN
4669    8.0          2         (6.5, 8.5]            NaN
4670    8.0          2         (6.5, 8.5]            NaN
4671    8.0          2         (6.5, 8.5]            NaN
4672    8.0          2         (6.5, 8.5]            NaN
4673    8.0          2         (6.5, 8.5]            NaN
4674    8.0          2         (6.5, 8.5]            NaN
4675    8.0          2         (6.5, 8.5]            NaN
4676    8.0          2         (6.5, 8.5]            NaN
4677    8.0          2         (6.5, 8.5]            NaN
11902   15.0         5         (12.9, 15]            NaN
11903   15.0         5         (12.9, 15]            NaN
11904   15.0         5         (12.9, 15]            NaN
11905   15.0         5         (12.9, 15]            NaN
11906   15.0         5         (12.9, 15]            NaN
11907   15.0         5         (12.9, 15]            NaN
11908   15.0         5         (12.9, 15]            NaN
11909   15.0         5         (12.9, 15]            NaN
11910   15.0         5         (12.9, 15]            NaN
11911   15.0         5         (12.9, 15]            NaN
12065   11.0         4         (10, 12.9]            NaN
12066   11.0         4         (10, 12.9]            NaN
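
A likely explanation for the NaN values, sketched on made-up data: value_counts() returns a Series indexed by bucket label (0, 1, 2, ...), while column assignment aligns on the DataFrame's row index. When the row labels do not match any bucket label, every row ends up as NaN. The prices and index values below are hypothetical, and 3 buckets are used instead of 10 to keep the example small.

```python
import pandas as pd

# Hypothetical toy data; the row index starts at 100 so it cannot
# collide with the bucket labels 0, 1, 2.
df = pd.DataFrame(
    {"sell_prix": [8.0, 8.0, 11.0, 11.0, 15.0, 15.0]},
    index=range(100, 106),
)

# value_counts() is indexed by bucket label, not by row label.
counts = pd.qcut(df["sell_prix"], 3, labels=False).value_counts()
bucket_labels = sorted(counts.index.tolist())
print(bucket_labels)

# Assignment aligns counts' index (bucket labels) with df's index
# (row labels 100..105); nothing matches, so every value is NaN.
df["products_by_number"] = counts
print(df)
```

If some row labels happened to equal a bucket label, only those rows would receive a value, and it would be the count of an unrelated bucket.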

For example, this is what I want:

      sell_prix PriceBucket PriceBucketTitle    products_by_number
4668    8.0          2         (6.5, 8.5]            984546.0
4669    8.0          2         (6.5, 8.5]            984546.0
4670    8.0          2         (6.5, 8.5]            984546.0
4671    8.0          2         (6.5, 8.5]            984546.0
4672    8.0          2         (6.5, 8.5]            984546.0
4673    8.0          2         (6.5, 8.5]            984546.0
4674    8.0          2         (6.5, 8.5]            984546.0
4675    8.0          2         (6.5, 8.5]            984546.0
4676    8.0          2         (6.5, 8.5]            984546.0
4677    8.0          2         (6.5, 8.5]            984546.0
11902   15.0         5         (12.9, 15]            1028141.0
11903   15.0         5         (12.9, 15]            1028141.0
11904   15.0         5         (12.9, 15]            1028141.0
11905   15.0         5         (12.9, 15]            1028141.0
11906   15.0         5         (12.9, 15]            1028141.0
11907   15.0         5         (12.9, 15]            1028141.0
11908   15.0         5         (12.9, 15]            1028141.0
11909   15.0         5         (12.9, 15]            1028141.0
11910   15.0         5         (12.9, 15]            1028141.0
11911   15.0         5         (12.9, 15]            1028141.0
12065   11.0         4         (10, 12.9]            48998.0
12066   11.0         4         (10, 12.9]            48998.0
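
One way this kind of output might be produced, as a minimal sketch rather than a definitive answer: compute the per-bucket counts once, then broadcast them back onto the rows with Series.map (or, equivalently, with groupby(...).transform("count")). Both keep one row per product. The prices and index values below are hypothetical, and 3 buckets are used instead of 10.

```python
import pandas as pd

# Hypothetical toy data standing in for the real product table.
df = pd.DataFrame(
    {"sell_prix": [7.0, 8.0, 8.0, 11.0, 15.0, 15.0]},
    index=[4668, 4669, 4670, 12065, 11902, 11903],
)

df["PriceBucket"] = pd.qcut(df["sell_prix"], 3, labels=False)

# Look up each row's bucket in the per-bucket counts.
bucket_sizes = df["PriceBucket"].value_counts()
df["products_by_number"] = df["PriceBucket"].map(bucket_sizes)

# Equivalent alternative: count the non-null prices within each bucket
# and broadcast the result back to the original rows.
alt = df.groupby("PriceBucket")["sell_prix"].transform("count")

print(df)
```

Note that transform("count") ignores rows with a missing sell_prix, whereas the map approach counts bucket labels; the two agree when there are no missing prices.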

Any help? Thanks!
