Say we are reading data from some source with multiple key-value pairs. Let's use the following list as an example:

[{'key0': 'key0_value0', 'key1': 'key1_value0'},
 {'key0': 'key0_value1', 'key1': 'key1_value1'}]

Reading the first item from that list should result in a CSV looking like this:

key_header | 0
------------------------
key0       | key0_value0
key1       | key1_value0

Reading the second item should now result in the following:

key_header | 0           | 1
--------------------------------------
key0       | key0_value0 | key0_value1
key1       | key1_value0 | key1_value1

This continues horizontally, adding one column per item, until the source is exhausted. The algorithm to write this is beyond me, and I am not sure the csv module will work, since it appears to assume data is written a row at a time.

2 Answers

You'll have to first collect all your 'columns', then write. You can do that by converting everything to a list of lists, then use zip(*columns) to transpose the list of columns to a list of rows:

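import csv

# First column: the header label followed by the sorted keys.
columns = [['key_header'] + sorted(inputlist[0].keys())]

for i, entry in enumerate(inputlist):
    # One column per entry: its index, then its values in key order.
    columns.append([i] + [entry[k] for k in columns[0][1:]])

# Python 3: text mode with newline=''; on Python 2 use open(outputfilename, 'wb').
with open(outputfilename, 'w', newline='') as output:
    writer = csv.writer(output)
    writer.writerows(zip(*columns))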

Demo showing the row output:

>>> from pprint import pprint
>>> inputlist = [{'key0': 'key0_value0', 'key1': 'key1_value0'},
...  {'key0': 'key0_value1', 'key1': 'key1_value1'}]
>>> columns = [['key_header'] + sorted(inputlist[0].keys())]  # first column
>>> for i, entry in enumerate(inputlist):
...     columns.append([i] + [entry[k] for k in columns[0][1:]])
... 
>>> pprint(list(zip(*columns)))  # list() is needed on Python 3, where zip() is lazy
[('key_header', 0, 1),
 ('key0', 'key0_value0', 'key0_value1'),
 ('key1', 'key1_value0', 'key1_value1')]
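
If the items really do arrive one at a time, as in the question, one option is to keep the columns in memory and regenerate the whole table after each item. The following is only a minimal sketch of that idea, assuming Python 3; the helper names `add_entry` and `render` are hypothetical and not part of the csv module, and the output goes to an in-memory buffer rather than a file so the result is easy to inspect.

```python
import csv
import io


def add_entry(columns, entry):
    """Append one dict as a new column; the first column holds the sorted keys."""
    if not columns:
        columns.append(['key_header'] + sorted(entry))
    index = len(columns) - 1  # column label: 0, 1, 2, ...
    columns.append([index] + [entry[k] for k in columns[0][1:]])


def render(columns):
    """Return CSV text for the current columns, transposed into rows."""
    buffer = io.StringIO(newline='')
    csv.writer(buffer).writerows(zip(*columns))
    return buffer.getvalue()


columns = []
for entry in [{'key0': 'key0_value0', 'key1': 'key1_value0'},
              {'key0': 'key0_value1', 'key1': 'key1_value1'}]:
    add_entry(columns, entry)
    print(render(columns))  # the full table is regenerated after each item
```

To write to disk instead, the text returned by `render` could be written to a file opened in 'w' mode with newline='', overwriting the previous version each time.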
Answered 2013-07-31T18:43:11.197

There is no way to write columns progressively, because that's not how text files (which CSV files are a subset of) work. You can't append to a line/row in the middle of a file; all you can do is append new lines at the end.

However, I'm not sure why you need to do this anyway. Just transpose the list in-memory, then write it row by row.

For example:

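import csv

values = [{'key0': 'key0_value0', 'key1': 'key1_value0'},
          {'key0': 'key0_value1', 'key1': 'key1_value1'}]
# Transpose the (key, value) pairs; assumes all dicts share the same key order.
transposed = zip(*(x.items() for x in values))
# Keep one copy of each key, followed by the value from every dict.
grouped = ([pairs[0][0]] + [pair[1] for pair in pairs] for pairs in transposed)
with open('output.csv', 'w', newline='') as output:  # example filename; Python 3
    csv.writer(output).writerows(grouped)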

Just transposing the items isn't quite enough, because you end up with a copy of each key for every value instead of just one copy. That's what grouped is for: it keeps the key once and then lists the values.
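
To make the difference concrete, here is a small sketch that prints both intermediate results for the example data. It assumes Python 3.7 or later, where dicts preserve insertion order, and it materializes the results as lists purely so they can be printed.

```python
values = [{'key0': 'key0_value0', 'key1': 'key1_value0'},
          {'key0': 'key0_value1', 'key1': 'key1_value1'}]

# Transposing alone pairs every value with its own copy of the key.
transposed = list(zip(*(x.items() for x in values)))
print(transposed)

# Grouping keeps the key once, then the values in input order.
grouped = [[pairs[0][0]] + [pair[1] for pair in pairs] for pairs in transposed]
print(grouped)
```

Each tuple in `transposed` repeats the key next to every value, while each row in `grouped` starts with the key and is followed only by the values.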

Answered 2013-07-31T18:42:39.477