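The csvreader documentation says:
... csvfile can be any object which supports the iterator protocol and returns a string each time its next() method is called ...
Hence a small change to the OP's original code: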
import csv
import os

filename = "tar.data"
with open(filename, 'rb') as csvfile:
    spamreader = csv.reader(csvfile)
    justtesting = csvfile.tell()
    size = os.fstat(csvfile.fileno()).st_size
    for row in spamreader:
        pos = csvfile.tell()
        print pos, "of", size, "|", justtesting

###############################################
def generator(csvfile):
    # readline seems to be the key
    while True:
        line = csvfile.readline()
        if not line:
            break
        yield line
###############################################

print
with open(filename, 'rb', 0) as csvfile:
    spamreader = csv.reader(generator(csvfile))
    justtesting = csvfile.tell()
    size = os.fstat(csvfile.fileno()).st_size
    for row in spamreader:
        pos = csvfile.tell()
        print pos, "of", size, "-", justtesting
Running it against my test data gives the output below, showing that the two approaches produce different results.
224 of 224 | 0
224 of 224 | 0
224 of 224 | 0
224 of 224 | 0
224 of 224 | 0
224 of 224 | 0
224 of 224 | 0
224 of 224 | 0
224 of 224 | 0
224 of 224 | 0
224 of 224 | 0
224 of 224 | 0
224 of 224 | 0
224 of 224 | 0
16 of 224 - 0
32 of 224 - 0
48 of 224 - 0
64 of 224 - 0
80 of 224 - 0
96 of 224 - 0
112 of 224 - 0
128 of 224 - 0
144 of 224 - 0
160 of 224 - 0
176 of 224 - 0
192 of 224 - 0
208 of 224 - 0
224 of 224 - 0
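I set zero buffering on the open() call, but it made no difference; what matters is the readline() in the generator. Iterating over a Python 2 file object goes through next(), which uses a hidden read-ahead buffer, so that is presumably why tell() already reports the end of this small file after the first row, whereas readline() only consumes one line at a time.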
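For anyone on Python 3, here is a rough sketch of the same readline idea. It is my adaptation, not the code above: the csv module there expects text rather than bytes, so this version reads the file in binary mode, decodes each line itself, and asks the binary file for its position. The names decoded_lines, rows_with_progress and write_sample are made up for illustration, and the sample data (14 lines of 16 bytes) is an assumption chosen to mimic the 224-byte test file.

```python
import csv
import os
import tempfile


def decoded_lines(binfile, encoding="utf-8"):
    """Yield decoded lines via readline(), one line per call."""
    while True:
        line = binfile.readline()
        if not line:
            break
        yield line.decode(encoding)


def rows_with_progress(path, encoding="utf-8"):
    """Yield (row, bytes_consumed, total_bytes) for each CSV row in path."""
    with open(path, "rb") as binfile:
        size = os.fstat(binfile.fileno()).st_size
        for row in csv.reader(decoded_lines(binfile, encoding)):
            # tell() on the binary file reports the offset after the
            # last line handed to the csv reader.
            yield row, binfile.tell(), size


def write_sample(path, rows=14):
    """Write hypothetical test data: each line is 16 bytes with its newline."""
    with open(path, "wb") as f:
        for i in range(rows):
            f.write(("row%04d,abcdefg\n" % i).encode("ascii"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sample.csv")
        write_sample(sample)
        for row, pos, size in rows_with_progress(sample):
            print(pos, "of", size)
```

With the sample file the position should advance by one line per row, as in the second half of the output above. A quoted field that spans several lines would make the reader pull several lines for one row, so the position would jump by more than one line there.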