I'm still learning Python and don't yet know how to work with arrays.

I want to read a tab-delimited file like the following example (my real file will have about 400 rows):

  Col1      Col2
  0.0001    0.6
  0.0001    0.5
  0.000006  0.8
  0.0001    0.0003
  0.002     1
  0.002     3

I want to get the following output:

Col1      Col2
0.0001    0.36676667
0.000006  0.8
0.002     2

So I want to keep each distinct value in Col1 and take the mean of all Col2 values that correspond to that same Col1 value.

I can read the file into an array using:

  arr = np.genfromtxt('test.csv', dtype=None, delimiter='\t', skip_header=1)

but I don't know how to perform these operations and write the newly generated data to a new file.

Thanks a lot for any help!

1 Answer

Use collections.defaultdict with list as the default factory.

Use the value in the first column as the key, and append the second value to that key's list.

import csv
from collections import defaultdict

# Gather the data from the tab-delimited file
d = defaultdict(list)
with open('data.csv', 'r') as csvfile:
    reader = csv.reader(csvfile, delimiter='\t')
    next(reader)  # skip the header row (Col1, Col2)
    for row in reader:
        d[float(row[0])].append(float(row[1]))

# Print each distinct Col1 value with the mean of its Col2 values.
for k, values in d.items():
    print(k, sum(values) / len(values))
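
The question also asks for the result to be written to a new file. A minimal sketch of that step, assuming Python 3.7+ and a header row in the input, might look like this. The function name write_group_means and the file names in the usage line are hypothetical. Unlike the snippet above, this version keys on the Col1 text rather than its float value, so 0.000006 is written back unchanged; the trade-off is that differently formatted spellings of the same number (0.0001 and 0.00010) would be treated as separate groups.

```python
import csv
from collections import defaultdict


def write_group_means(in_path, out_path):
    """Average Col2 per distinct Col1 value and write a tab-delimited file.

    Hypothetical helper: the name and signature are illustrative only.
    """
    groups = defaultdict(list)
    with open(in_path, newline='') as src:
        reader = csv.reader(src, delimiter='\t')
        header = next(reader)  # keep the header row for the output file
        for row in reader:
            if not row:  # skip blank lines
                continue
            # Assumption: key on the Col1 text exactly as written.
            groups[row[0]].append(float(row[1]))

    with open(out_path, 'w', newline='') as dst:
        writer = csv.writer(dst, delimiter='\t', lineterminator='\n')
        writer.writerow(header)
        # dicts preserve insertion order on Python 3.7+, so groups are
        # written in the order they first appear in the input.
        for key, values in groups.items():
            writer.writerow([key, sum(values) / len(values)])
```

Calling write_group_means('data.csv', 'means.csv') would then produce one output row per distinct Col1 value, in order of first appearance.
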
answered 2013-07-23T20:58:42.857