I have a "not so" large file (~2.2GB) that I am trying to read and process...
from collections import defaultdict
import os

graph = defaultdict(dict)
error = open("error.txt", "w")
print "Reading file"
with open("final_edge_list.txt", "r") as f:
    for line in f:
        try:
            line = line.rstrip(os.linesep)
            tokens = line.split("\t")
            if len(tokens) == 3:
                src = long(tokens[0])
                destination = long(tokens[1])
                weight = float(tokens[2])
                #tup1 = (destination,weight)
                #tup2 = (src,weight)
                graph[src][destination] = weight
                graph[destination][src] = weight
            else:
                print "error ", line
                error.write(line + "\n")
        except Exception, e:
            string = str(Exception) + " " + str(e) + "==> " + line + "\n"
            error.write(string)
            continue
Am I doing something wrong?
It has been about an hour and the code is still reading the file.
Memory usage has already reached 20GB. Why does it take so much time and memory?
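
To get a feel for where the memory might be going, here is a minimal sketch that builds the same dict-of-dicts from a few in-memory lines and adds up `sys.getsizeof` for the containers and the boxed numbers. It assumes Python 3 (so `int` stands in for `long`) and CPython. The helper names `build_graph` and `shallow_size` are made up for illustration, and the total is only a rough estimate, since it can count a shared integer object more than once.

```python
import sys
from collections import defaultdict


def build_graph(lines):
    """Build an undirected weighted graph as a dict of dicts.

    Returns the graph and the list of lines that could not be parsed.
    """
    graph = defaultdict(dict)
    bad = []
    for line in lines:
        tokens = line.rstrip("\n").split("\t")
        if len(tokens) != 3:
            bad.append(line)
            continue
        try:
            src = int(tokens[0])
            destination = int(tokens[1])
            weight = float(tokens[2])
        except ValueError:
            bad.append(line)
            continue
        # Each edge is stored twice, once per direction.
        graph[src][destination] = weight
        graph[destination][src] = weight
    return graph, bad


def shallow_size(graph):
    """Rough byte estimate: containers plus the key and value objects."""
    total = sys.getsizeof(graph)
    for node, neighbours in graph.items():
        total += sys.getsizeof(node) + sys.getsizeof(neighbours)
        for destination, weight in neighbours.items():
            total += sys.getsizeof(destination) + sys.getsizeof(weight)
    return total


if __name__ == "__main__":
    sample = ["1\t2\t0.5\n", "2\t3\t1.5\n", "not an edge\n"]
    graph, bad = build_graph(sample)
    edges = sum(len(n) for n in graph.values()) // 2
    print("edges:", edges, "bad lines:", len(bad))
    print("approx bytes per edge:", shallow_size(graph) // edges)
```

Running this on a small slice of final_edge_list.txt and comparing the bytes-per-edge figure with the length of a text line would show how much the in-memory structure expands relative to the file.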