python - Python csv reader-zipping reader with range

Question

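I have a very simple csv file of this kind (as an example, I filled it with Fibonacci numbers):

I just want to process the rows in bulk in the following way (the Fibonacci numbers themselves don't matter):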

import csv
b=0
s=1
i=1
itera=0
maximum=10000
bulk_save=10
csv_file='really_simple.csv'
fo = open(csv_file)
reader = csv.reader(fo)
##Skipping headers
_headers=reader.next()

while (s>0) and itera<maximum:
    print 'processing...'
    b+=1
    tobesaved=[]
    for row,i in zip(reader,range(1,bulk_save+1)): 
        itera+=1
        tobesaved.append(row)
        print itera,row[0]    
    s=len(tobesaved)        
    print 'chunk no '+str(b)+' processed '+str(s)+' rows'  
print 'Exit.'

The output I get is a bit strange (it looks as if the reader skips one entry at the end of each loop):

processing...
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
chunk no 1 processed 10 rows
processing...
11 12
12 13
13 14
14 15
15 16
16 17
17 18
18 19
19 20
20 21
chunk no 2 processed 10 rows
processing...
21 23
22 24
23 25
24 26
25 27
chunk no 3 processed 5 rows
processing...
chunk no 4 processed 0 rows
Exit.

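Do you know what the problem might be? My guess is the zip function.

The reason I have code like this (fetching the data in chunks) is that I need to bulk-save the csv entries to a sqlite3 database (using executemany and committing at the end of each zip loop), so that I don't overload my memory. Thanks!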

score 2 · Accepted Answer
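
The skipped rows in the question are consistent with how zip consumes its arguments: it takes an item from reader first and only then asks range, so when range runs out, the row it already took is discarded. Below is a minimal sketch of that behavior, written for Python 3 (the question uses Python 2, where zip evaluates its arguments in the same order). The iterator of letters stands in for csv.reader, and chunks_with_zip is a made-up helper used only for this illustration.

```python
def chunks_with_zip(items, size, range_first):
    """Split `items` into chunks using zip, as the question's loop does."""
    rows = iter(items)  # stands in for csv.reader (assumption for this sketch)
    chunks = []
    while True:
        if range_first:
            # range() is asked first, so it stops zip before a row is taken.
            chunk = [row for _, row in zip(range(size), rows)]
        else:
            # `rows` is asked first; when range() is then exhausted,
            # the row that was just taken is silently dropped.
            chunk = [row for row, _ in zip(rows, range(size))]
        if not chunk:
            break
        chunks.append(chunk)
    return chunks


letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
lossy = chunks_with_zip(letters, 3, range_first=False)
safe = chunks_with_zip(letters, 3, range_first=True)
print(lossy)  # 'd' is missing
print(safe)   # every item is present
```

Putting range first avoids the loss in this sketch, but collecting the rows in an explicit loop is easier to follow.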

Try the following:

import csv

def process(rows, chunk_no):
    # Each row is unpacked into two columns: a running number and its value.
    for no, data in rows:
        print no, data
    print 'chunk no {} process {} rows'.format(chunk_no, len(rows))

csv_file='really_simple.csv'
with open(csv_file) as fo:
    reader = csv.reader(fo)
    _headers = reader.next()  # skip the header row

    chunk_no = 1
    tobesaved = []
    for row in reader:
        tobesaved.append(row)
        # Flush a full chunk of 10 rows, then start a new one.
        if len(tobesaved) == 10:
            process(tobesaved, chunk_no)
            chunk_no += 1
            tobesaved = []
    # Handle the final, possibly shorter, chunk.
    if tobesaved:
        process(tobesaved, chunk_no)

This prints:

1 1
2 1
3 2
4 3
5 5
6 8
7 13
8 21
9 34
10 55
chunk no 1 process 10 rows
11 89
12 144
13 233
14 377
15 610
16 987
17 1597
18 2584
19 4181
20 6765
chunk no 2 process 10 rows
21 10946
22 17711
23 28657
24 46368
25 75025
26 121393
27 196418
chunk no 3 process 7 rows
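
Since the goal in the question is to save each chunk to sqlite3 with executemany and commit once per chunk, the same chunking can also be sketched with itertools.islice, which never takes more rows than it returns. This is an illustration in Python 3, not the code from the answer above: iter_chunks, the in-memory CSV sample, and the fib table schema are all assumptions made for the example.

```python
import csv
import io
import sqlite3
from itertools import islice


def iter_chunks(iterable, size):
    """Yield lists of up to `size` items without dropping any."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Hypothetical stand-in for really_simple.csv: a header plus two-column rows.
sample = io.StringIO("n,fib\n1,1\n2,1\n3,2\n4,3\n5,5\n6,8\n7,13\n")

conn = sqlite3.connect(":memory:")  # in-memory database, for the sketch only
conn.execute("CREATE TABLE fib (n INTEGER, value INTEGER)")  # assumed schema

reader = csv.reader(sample)
next(reader)  # skip the header row

chunk_sizes = []
for chunk in iter_chunks(reader, 3):
    # One executemany call and one commit per chunk keeps memory use bounded.
    conn.executemany("INSERT INTO fib (n, value) VALUES (?, ?)", chunk)
    conn.commit()
    chunk_sizes.append(len(chunk))

saved = conn.execute("SELECT COUNT(*) FROM fib").fetchone()[0]
print(chunk_sizes, saved)
```

With a real file, the StringIO object would be replaced by the opened csv file and the chunk size raised to whatever fits comfortably in memory.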
