我有不同状态的客户重复,因为每个客户订阅/产品都有一行。我想new_status
为客户生成一个“取消”,每个订阅状态都必须一起“取消”。
我用了:
df['duplicated'] = df.groupby('customer', as_index=False)['customer'].cumcount()
分隔索引中的每个重复项以指示重复值
Customer | Status | new_status | duplicated
X |canceled| | 0
X |canceled| | 1
X |active | | 2
Y |canceled| | 0
A |canceled| | 0
A |canceled| | 1
B |active | | 0
B |canceled| | 1
因此,我想使用 .apply 和/或 .loc 来生成:
Customer | Status | new_status | duplicated
X |canceled| | 0
X |canceled| | 1
X |active | | 2
Y |canceled| | 0
A |canceled| canceled | 0
A |canceled| canceled | 1
B |active | | 0
B |canceled| | 1