我有一个数据框df1:
Plant name Brand Region Units produced capacity Cost incurred
Gujarat Plant Hyundai Asia 8500 9250 18500000
Haryana Plant Honda Asia 10000 10750 21500000
Chennai Plant Hyundai Asia 12000 12750 25500000
Zurich Plant Volkswagen Europe 25000 25750 77250000
Chennai Plant Suzuki Asia 6000 6750 13500000
Rengensburg BMW Europe 12500 13250 92750000
Dingolfing Mercedes Europe 14000 14750 103250000
我想要一个具有以下格式的输出数据框:
df2= Region BMW Mercedes Volkswagen Toyota Suzuki Honda Hyundai
Europe
North America
Asia
Oceania
其中每个单元格的内容等于sum(cost incurred) / sum(units produced)那个特定的Region和Brand。
我尝试过的代码导致ValueError:
for i,j in itertools.zip_longest(range(len(df2),range(len(df2.columns)):
if (df2.index[i] in list(df1["Region"]) & df2.columns[j] in list(df1["Brand"])==True:
temp1 = df1["Region"]==df2.index[i]
temp2 = df1["Brand"]==df2.columns[j]]
df2.loc[df2.index[i],df2.columns[j]] = df1(temp1&temp2)["Cost incurred"].sum()/
df1(temp1&temp2)["Units Produced"].sum()
elif (df2.index[i] in list(df1["Region"]) & df2.columns[j] in list(df1["Brand"])==False:
df2.loc[df2.index[i],df2.columns[j]] = 0
ValueError:具有多个元素的数组的真值不明确。使用 a.any() 或 a.all()