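I have the following reproducible code, in which I create a dictionary, build a DataFrame from it, group it by the Metropolitan area column, and use the agg() function to compute the mean for each group: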
import numpy as np
import pandas as pd

dictionaryMLB = {'Metropolitan area': ['New York City', 'New York City', 'Los Angeles', 'Los Angeles', 'San Francisco Bay Area', 'San Francisco Bay Area', 'Chicago', 'Chicago'],
                 'Population (2016 est.)[8]': [20153634, 20153634, 13310447, 13310447, 6657982, 6657982, 9512999, 9512999],
                 'MLB': ['Yankees', 'Mets', 'Dodgers', 'Angels', 'Giants', 'Athletics', 'Cubs', 'White Sox']}
df = pd.DataFrame(dictionaryMLB)
df.groupby('Metropolitan area').agg([np.mean])
My output is as follows:
                        Population (2016 est.)[8]
                                             mean
Metropolitan area
Chicago                                   9512999
Los Angeles                              13310447
New York City                            20153634
San Francisco Bay Area                    6657982
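I would like to avoid the two-level column name and keep only one of the two labels, either Population (2016 est.)[8] or mean, to get something like the following: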
                            mean
Metropolitan area
Chicago                  9512999
Los Angeles             13310447
New York City           20153634
San Francisco Bay Area   6657982
How should I go about this?
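
One possible approach, offered as a sketch rather than a definitive answer: the two-level header appears because agg() is given a list of functions, so pandas labels each result with both the column name and the function name. Selecting the population column before aggregating, or using named aggregation (which assumes pandas 0.25 or later), should give a single-level header. The variable names col, flat_mean, flat_named and flat_orig are hypothetical and chosen only for illustration. Selecting the column first also keeps the non-numeric MLB column out of the mean, which newer pandas versions may refuse to average.

```python
import pandas as pd

dictionaryMLB = {
    'Metropolitan area': ['New York City', 'New York City', 'Los Angeles', 'Los Angeles',
                          'San Francisco Bay Area', 'San Francisco Bay Area', 'Chicago', 'Chicago'],
    'Population (2016 est.)[8]': [20153634, 20153634, 13310447, 13310447,
                                  6657982, 6657982, 9512999, 9512999],
    'MLB': ['Yankees', 'Mets', 'Dodgers', 'Angels', 'Giants', 'Athletics', 'Cubs', 'White Sox'],
}
df = pd.DataFrame(dictionaryMLB)

col = 'Population (2016 est.)[8]'  # hypothetical helper name for the column label

# Option 1: select the column, take the mean, and name the resulting column 'mean'.
flat_mean = df.groupby('Metropolitan area')[col].mean().to_frame('mean')

# Option 2: named aggregation; the keyword becomes the output column name.
flat_named = df.groupby('Metropolitan area').agg(mean=(col, 'mean'))

# Option 3: keep the original column name instead of 'mean'.
flat_orig = df.groupby('Metropolitan area')[[col]].mean()

print(flat_mean)
```

Options 1 and 2 keep only the mean label, while option 3 keeps only the Population (2016 est.)[8] label; which one fits depends on which of the two names should survive.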