python - Merging data based on two conditions

Question

I have four columns of data that I am trying to merge based on two conditions. The data is formatted like this:

CountyName  Year    Oil Gas
ANDERSON    2010    1358    0
ANDERSON    2010    621746  4996766
ANDERSON    2011    1587    0
ANDERSON    2011    633120  5020877
ANDERSON    2012    55992   387685
ANDERSON    2012    1342    0
ANDERSON    2013    635572  3036578
ANDERSON    2013    4873    0
ANDERSON    2014    656440  2690333
ANDERSON    2014    12332   0
ANDERSON    2015    608454  2836272
ANDERSON    2015    23339   0
ANDERSON    2016    551728  2682261
ANDERSON    2016    12716   0
ANDERSON    2017    132466  567874
ANDERSON    2017    1709    0
ANDREWS 2010    25701725    1860063
ANDREWS 2010    106351  0
ANDREWS 2011    97772   0
ANDREWS 2011    28818329    1377865
ANDREWS 2012    105062  0
...

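I want to combine the Oil and Gas values of the duplicate entries. For example, I would like to add up all the Oil entries for ANDERSON county in 2010 and have that total replace the existing entries in a single row. The code I am using now sums all values for each county regardless of year, which gives me condensed output like this: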

CountyName  Year    Oil Gas
ANDERSON        3954774 
ANDREWS      206472698  
...

This is the code I am using:

import csv
with open('Texas.csv', 'r') as Texas: #opening Texas csv file
    TexasReader = csv.reader(Texas)
    counties = {}
    years = {}

    index = 0 and 1
    for row in TexasReader:
        if index == 0 and 1:
            header = row
        else:
            county = row[0]
            year = row[1]
            oil = row[2]
            gas = row[3]

            if county in counties:
                counties[county] += int(oil)
            else:
                counties[county] = int(oil)
        index += 1

    with open('TexasConsolidated.csv', 'w') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=header, delimiter=',', lineterminator='\n')
        writer.writeheader()
        for k, v in counties.items():
            writer.writerow({header[0]: k, header[2]: v})

score 0 · Accepted Answer

These are the lines that cause the behavior you describe:

if county in counties: 
    counties[county] += int(oil)

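The dict is keyed on the county alone, so every year for a county is added into the same entry. If you want a dict that stores a sum per pair of values, both values need to be part of the dict key.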

Add the line

counties_years = {}

and then sum like this, using the tuple (county, year) as the key:

if (county,year) in counties_years: 
    counties_years[(county,year)] += int(oil) 
else:
    counties_years[(county,year)] = int(oil)
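
For reference, here is a minimal end-to-end sketch of the same tuple-key idea that totals both Oil and Gas and writes one row per (county, year). It assumes the file is comma-separated with the header shown in the question. The `consolidate` helper and the in-memory `sample` data are hypothetical stand-ins for reading `Texas.csv`, so treat it as one possible arrangement rather than the only way to do it.

```python
import csv
import io
from collections import defaultdict


def consolidate(rows):
    """Sum Oil and Gas for each (CountyName, Year) pair."""
    # (county, year) -> [oil_total, gas_total]
    totals = defaultdict(lambda: [0, 0])
    for county, year, oil, gas in rows:
        totals[(county, year)][0] += int(oil)
        totals[(county, year)][1] += int(gas)
    return totals


# A few rows copied from the question stand in for Texas.csv (assumed layout).
sample = io.StringIO(
    "CountyName,Year,Oil,Gas\n"
    "ANDERSON,2010,1358,0\n"
    "ANDERSON,2010,621746,4996766\n"
    "ANDERSON,2011,1587,0\n"
    "ANDERSON,2011,633120,5020877\n"
)
reader = csv.reader(sample)
header = next(reader)  # consume the header row instead of tracking an index
totals = consolidate(reader)

# Write one consolidated row per (county, year) pair.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(header)
for (county, year), (oil, gas) in totals.items():
    writer.writerow([county, year, oil, gas])

print(out.getvalue())
```

With a real file, replacing `sample` and `out` with `open('Texas.csv', newline='')` and `open('TexasConsolidated.csv', 'w', newline='')` should give the same result on disk.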
