Here is an existing df:
import pandas as pd

# Each row holds: Market, Product Code, Week, Sales, Units
data = [['Total Customers',123,1,500,400],
['Total Customers',123,2,400,320],
['Major Customer 1',123,1,100,220],
['Major Customer 1',123,2,230,230],
['Major Customer 2',123,1,130,30],
['Major Customer 2',123,2,20,10],
['Total Customers',456,1,500,400],
['Total Customers',456,2,400,320],
['Major Customer 1',456,1,100,220],
['Major Customer 1',456,2,230,230],
['Major Customer 2',456,1,130,30],
['Major Customer 2',456,2,20,10]]
# Pass the header as column names so the columns are labeled and the numbers stay numeric
df = pd.DataFrame(data, columns=['Market','Product Code','Week','Sales','Units'])
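I would like to create new rows from the difference between the values in the rows where the 'Market' column is 'Total Customers' and the values in the rows where 'Market' is 'Major Customer 1' plus 'Major Customer 2'. The new rows should have 'Remaining Customers' as their 'Market' value and be appended to the same df.
Overall, I am essentially trying to calculate the remaining Sales and Units "gap" in the market.
This is what I have tried so far using loc, but I keep getting a KeyError. Can anyone help?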
df.loc[df['Market'] == 'Remaining Customers'] =
df.loc[df['Market'] == 'Total Customers']-
(df.loc[df['Market'] == 'Major Customer 1']+df.loc[df['Market'] == 'Major Customer 2'])
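For what it's worth, here is a rough sketch of one way this could be done, not necessarily the only or best one. It assumes that rows should be matched on 'Product Code' and 'Week', and that the df is built with named columns from a plain list of rows. The names `keys`, `values`, `total`, `majors`, `remaining` and `result` are just illustrative. The idea is to index the 'Total Customers' rows by the matching keys, sum the two major customers per key with `groupby`, subtract, and then append the result with `pd.concat`. Note that assigning through `df.loc[mask]` with a mask that matches no rows does not create new rows, and subtracting slices directly aligns on row labels, which do not overlap between the 'Total Customers' and 'Major Customer' rows.

```python
import pandas as pd

columns = ["Market", "Product Code", "Week", "Sales", "Units"]
rows = [
    ["Total Customers", 123, 1, 500, 400],
    ["Total Customers", 123, 2, 400, 320],
    ["Major Customer 1", 123, 1, 100, 220],
    ["Major Customer 1", 123, 2, 230, 230],
    ["Major Customer 2", 123, 1, 130, 30],
    ["Major Customer 2", 123, 2, 20, 10],
    ["Total Customers", 456, 1, 500, 400],
    ["Total Customers", 456, 2, 400, 320],
    ["Major Customer 1", 456, 1, 100, 220],
    ["Major Customer 1", 456, 2, 230, 230],
    ["Major Customer 2", 456, 1, 130, 30],
    ["Major Customer 2", 456, 2, 20, 10],
]
df = pd.DataFrame(rows, columns=columns)

# Assumption: a "Total" row and its "Major" rows share Product Code and Week.
keys = ["Product Code", "Week"]
values = ["Sales", "Units"]

# Totals, indexed by the matching keys.
total = df[df["Market"] == "Total Customers"].set_index(keys)[values]

# Combined Sales/Units of the major customers for each key.
majors = (
    df[df["Market"].isin(["Major Customer 1", "Major Customer 2"])]
    .groupby(keys)[values]
    .sum()
)

# Subtraction aligns on the (Product Code, Week) index.
remaining = (total - majors).reset_index()
remaining.insert(0, "Market", "Remaining Customers")

# Append the new rows to the original frame.
result = pd.concat([df, remaining], ignore_index=True)
print(result[result["Market"] == "Remaining Customers"])
```

If some Product Code / Week combination had a total but no major-customer rows, the plain subtraction would give NaN for it; `total.sub(majors, fill_value=0)` might be a reasonable alternative in that case.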