python - Sum of values in a JSON file matching certain parameters in Python

Question

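I have this Python script that reads a JSON file and picks out the car with the most sales (that part is done). Now I need to figure out how to find the year (car_year) with the highest sales. In our sample JSON file, 2002 had 296 cars sold, compared with 264 for 2007. I have worked out how to sum all the "total_sales" values in the JSON file, but I still need to find the year with the highest total sales.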

Python script:

#!/usr/bin/env python
import json

# open the file in a with-block so it is closed after loading
with open('/home/ahmed/events.json') as f:
    data = json.load(f)

# find the item with the highest sales
event = max(data, key=lambda ev: ev['total_sales'])
print(event)

# sum the "total_sales" of all the items in the json file
count = sum(int(x['total_sales']) for x in data)
print(count)

Here is the JSON file (a test sample):

[
    {
        "id": 47,
        "car": {
            "car_make": "Lamborghini",
            "car_model": "Murciélago",
            "car_year": 2002
        },
        "price": "$13724.05",
        "total_sales": 149
    },
    {
        "id": 48,
        "car": {
            "car_make": "volvo",
            "car_model": "x20",
            "car_year": 2010
        },
        "price": "$13724.05",
        "total_sales": 10
    },
    {
        "id": 49,
        "car": {
            "car_make": "kia",
            "car_model": "kia1.2",
            "car_year": 2007
        },
        "price": "$13724.05",
        "total_sales": 114
    },
    {
        "id": 50,
        "car": {
            "car_make": "renault",
            "car_model": "p300",
            "car_year": 2002
        },
        "price": "$13724.05",
        "total_sales": 147
    },
    {
        "id": 51,
        "car": {
            "car_make": "ferrari",
            "car_model": "red",
            "car_year": 2007
        },
        "price": "$13724.05",
        "total_sales": 150
    }
]

score 1 · Accepted Answer

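Use pandas
- Use pandas.DataFrame.sum to get the sum of one or more columns.
- Use Pandas: Boolean Indexing for vectorized data selection.
Breakdown of df['car.car_year'][df.total_sales == df.total_sales.max()]
- df['car.car_year'] selects the desired column to return
  - Use df to return all columns
- [df.total_sales == df.total_sales.max()] creates a Boolean for every row, True where total_sales matches total_sales.max()
Use pandas.DataFrame.groupby to group by a specific column and aggregate with different calculations such as .sum and .max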
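
The two steps of the Boolean-indexing breakdown can be seen in isolation with a small sketch. It assumes pandas 1.0 or later (for pd.json_normalize) and uses a cut-down, hypothetical `records` list in place of the real file; the names `mask` and `top_years` are illustrative only.

```python
import pandas as pd

# Hypothetical inline records mirroring the nesting of the sample file
records = [
    {"car": {"car_year": 2002}, "total_sales": 149},
    {"car": {"car_year": 2010}, "total_sales": 10},
    {"car": {"car_year": 2007}, "total_sales": 150},
]
df = pd.json_normalize(records)  # nested keys become columns like 'car.car_year'

# Step 1: the comparison produces one Boolean per row
mask = df["total_sales"] == df["total_sales"].max()

# Step 2: indexing with the mask keeps only the rows marked True
top_years = df.loc[mask, "car.car_year"]

print(mask.tolist())       # [False, False, True]
print(top_years.tolist())  # [2007]
```

Because the mask is evaluated per row, this selects the year of the single best-selling record, and it returns several rows if more than one record ties for the maximum.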

import pandas as pd
import json

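# read the file (the with-block closes it after loading)
with open('/home/ahmed/events.json') as f:
    data = json.load(f)

# load into pandas; the nested "car" object is flattened into columns such as car.car_year
df = pd.json_normalize(data)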

# display(df)
   id      price  total_sales car.car_make car.car_model  car.car_year
0  47  $13724.05          149  Lamborghini    Murciélago          2002
1  48  $13724.05           10        volvo           x20          2010
2  49  $13724.05          114          kia        kia1.2          2007
3  50  $13724.05          147      renault          p300          2002
4  51  $13724.05          150      ferrari           red          2007

# sum of total_sales
df.total_sales.sum()

[out]: 
570

# year of the single record with the highest total_sales (a row-level maximum, not a per-year total)
df['car.car_year'][df.total_sales == df.total_sales.max()]

[out]:
4    2007
Name: car.car_year, dtype: int64

# find the total sales per year (the string 'sum' avoids the deprecation warning newer pandas raises for the builtin)
dfg = df.groupby('car.car_year', as_index=False).agg({'total_sales': 'sum'})

# display(dfg)
   car.car_year  total_sales
0          2002          296
1          2007          264
2          2010           10

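# get the year with the highest total sales
# (calling .max() on the grouped frame takes each column's maximum separately,
#  which would pair the largest year, 2010, with the largest total, 296)
dfg.loc[dfg['total_sales'].idxmax()]

[out]:
car.car_year    2002
total_sales      296
Name: 0, dtype: int64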
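
If pandas is not available, the same per-year totals can be computed with the standard library alone. The following is a minimal sketch rather than part of the accepted answer: `best_sales_year` is a hypothetical helper name, and the sample records are inlined (trimmed to the fields used) instead of being read from the file.

```python
import json
from collections import Counter

def best_sales_year(records):
    """Return (car_year, summed total_sales) for the year with the highest total."""
    totals = Counter()
    for rec in records:
        # accumulate each record's sales under its model year
        totals[rec["car"]["car_year"]] += int(rec["total_sales"])
    # most_common(1) returns a one-element list holding the largest (key, count) pair
    return totals.most_common(1)[0]

# Inlined copy of the sample data, reduced to the two fields that matter here
sample = json.loads("""
[
    {"id": 47, "car": {"car_year": 2002}, "total_sales": 149},
    {"id": 48, "car": {"car_year": 2010}, "total_sales": 10},
    {"id": 49, "car": {"car_year": 2007}, "total_sales": 114},
    {"id": 50, "car": {"car_year": 2002}, "total_sales": 147},
    {"id": 51, "car": {"car_year": 2007}, "total_sales": 150}
]
""")

print(best_sales_year(sample))  # (2002, 296)
```

To run it against the real file, load the records with json.load and pass the resulting list to the helper.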
