
Below is an existing df:

    data = np.array([['','Market','Product Code','Week','Sales','Units'],
                     ['Total Customers',123,1,500,400],
                     ['Total Customers',123,2,400,320],
                     ['Major Customer 1',123,1,100,220],
                     ['Major Customer 1',123,2,230,230],
                     ['Major Customer 2',123,1,130,30],
                     ['Major Customer 2',123,2,20,10],
                     ['Total Customers',456,1,500,400],
                     ['Total Customers',456,2,400,320],
                     ['Major Customer 1',456,1,100,220],
                     ['Major Customer 1',456,2,230,230],
                     ['Major Customer 2',456,1,130,30],
                     ['Major Customer 2',456,2,20,10]])

    df = pd.DataFrame(data)

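I would like to create new rows holding the difference between the "Total Customers" rows and the sum of the "Major Customer 1" and "Major Customer 2" rows, as identified by the "Market" column. The new rows should have "Remaining Customers" as their "Market" value and be appended to the same df.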

Overall, I am basically trying to calculate the Sales and Units "gap" for the remainder of the market.

This is what I have tried so far using loc, but I keep getting a KeyError. Can anyone help?

    df.loc[df['Market'] == 'Remaining Customers'] = (
        df.loc[df['Market'] == 'Total Customers']
        - (df.loc[df['Market'] == 'Major Customer 1'] + df.loc[df['Market'] == 'Major Customer 2'])
    )

1 Answer


See this notebook for details: https://nbviewer.jupyter.org/github/emican86/48999037/blob/master/48999037.ipynb

.loc is primarily label-based. The data has to be aligned and the labels have to be set before the rows can be subtracted.
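
A minimal sketch of what aligning the data and setting the labels can look like; it is not taken from the linked notebook. One likely cause of the KeyError is that `pd.DataFrame(data)` is given no column names, so the frame only has integer column labels and `df['Market']` cannot be found. The sketch assumes the five column names from the question's header row, treating the empty first entry as a placeholder for the index. The `by_key` helper and the `keys` and `values` names are hypothetical, introduced only for illustration.

```python
import pandas as pd

# Assumption: the column names come from the question's header row,
# minus its empty first entry.
columns = ['Market', 'Product Code', 'Week', 'Sales', 'Units']
rows = [
    ['Total Customers', 123, 1, 500, 400],
    ['Total Customers', 123, 2, 400, 320],
    ['Major Customer 1', 123, 1, 100, 220],
    ['Major Customer 1', 123, 2, 230, 230],
    ['Major Customer 2', 123, 1, 130, 30],
    ['Major Customer 2', 123, 2, 20, 10],
    ['Total Customers', 456, 1, 500, 400],
    ['Total Customers', 456, 2, 400, 320],
    ['Major Customer 1', 456, 1, 100, 220],
    ['Major Customer 1', 456, 2, 230, 230],
    ['Major Customer 2', 456, 1, 130, 30],
    ['Major Customer 2', 456, 2, 20, 10],
]
df = pd.DataFrame(rows, columns=columns)

keys = ['Product Code', 'Week']   # labels shared by every market
values = ['Sales', 'Units']       # numeric columns to subtract


def by_key(market):
    """Return one market's rows, indexed by the shared keys."""
    return df.loc[df['Market'] == market].set_index(keys)[values]


# With identical (Product Code, Week) labels on each operand,
# pandas aligns the rows before doing the arithmetic.
remaining = by_key('Total Customers') - (
    by_key('Major Customer 1') + by_key('Major Customer 2')
)

# Turn the labels back into columns and tag the new rows.
remaining = remaining.reset_index()
remaining.insert(0, 'Market', 'Remaining Customers')

# Append the new rows to the original frame.
result = pd.concat([df, remaining], ignore_index=True)
print(result.tail(4))
```

With the question's numbers, the four appended rows hold Sales of 270, 150, 270, 150 and Units of 150, 80, 150, 80.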

Answered 2018-02-27T02:28:48.470